MULTICS TECHNICAL BULLETIN MTB-253

To: Distribution

From: J. Berson

Date: 02/03/76

Subject: Release 2 of the Multics Sort/Merge



Attached is information about Release 2 of the Multics
Sort/Merge, which is scheduled for Multics Release 4.0 in June
1976. There are four write-ups (sort command, merge command,
sort_ subroutine, and merge_ subroutine) in the usual form
for the Multics Programmers' Manual, and one write-up of
additional interfaces to be documented in the PLM.

Comments and criticisms are solicited, whether on technical
aspects or on the documentation. They may be sent to Joel Berson
at Honeywell Billerica by mail or phone, or via "mail Berson
MSORT" on either the MIT or Phoenix Multics systems.



Multics Project internal working documentation. Not to be
reproduced or distributed outside the Multics Project.

1. NEW FUNCTIONS

1.1 A Merge, or file collation, function has been added.

1.2 A subroutine interface for both the Sort and Merge has been
    added.

1.3 Support for the SORT portion of the ANSI COBOL Sort/Merge
    Module, Level 2, has been added. (The COBOL MERGE function
    is not supported by this package.)

1.4 Additional data types for keys and multiple key fields are
    supported. Release 1 supported only character string and a
    single key field.

1.5 Additional storage media and file organizations are
    supported for the input and output files. Essentially any
    file can be supported which can be read or written
    sequentially via iox_ using any available I/O module.
    Release 1 supported only sequential input and output files
    in the Multics storage system (using vfile_).

1.6 The following additional user exit points are provided:

    input_record exit:   Permits the user to alter, delete, or
                         insert records before they enter the
                         sorting or merging process.

    output_record exit:  Permits the user to alter, delete,
                         insert, or summarize records coming
                         out of the sorting or merging process
                         before they are written to the output
                         file.

1.7 Sequence checking for output records has been added.

1.8 A file size argument has been added.

1.9 Command arguments for measurement and testing have been
    added (-time, -merge_order, and -string_size).

2. COMMAND INCOMPATIBILITIES

    The keyword -sort_desc (-sd) must precede the pathname of
    the Sort Description (when the Sort Description is supplied
    in a segment). In Release 1, the pathname of the Sort
    Description had to be the first argument and was not
    preceded by a keyword.

3. QUESTIONS ABOUT DOCUMENTATION

    I would like to raise the following questions about
    documentation of the Sort/Merge.

3.1 Should the Sort and the Merge be documented in four separate
    MPM write-ups, as attached? Or should the Merge (command and
    subroutine) be documented in two shorter write-ups which
    then refer to the two Sort write-ups for details? There is
    much in common between the Sort and the Merge. On the other
    hand, noting differences applicable to the Merge in the Sort
    write-ups may be somewhat complicated and confusing.

3.2 Should there be a separate Users' Guide for the Sort/Merge?
    If so, what information should go in the MPM and what in the
    Users' Guide? Some information not presently in the MPM
    write-ups which might go into a Users' Guide is:

       Text of error messages.

       Description of the report produced by the Sort/Merge
       (various counts of records processed, data produced by
       the -time argument).

       I/O usage, e.g. for PL/I I/O, Fortran, record_stream_,
       syn_, etc.

       Relationship between file size, work space required,
       optimization, etc.

3.3 Should the additional command arguments described in the PLM
    write-up be documented directly in the MPM Commands
    write-ups?

Name: sort

The sort command provides a generalized file sorting
capability, which is specialized for execution by user
supplied parameters. The basic function of the Sort is to
read one or more input files of records which are not
ordered, sort those records according to the values of one
or more key fields, and write a single file of ordered (or
"ranked") records. The Sort has the following general
capabilities:

Input and output files may be on any storage medium and in
any file organization;

Very large files, such as multisegment files, can be sorted;

Multiple key fields and most PL/I string and numeric data
types may be specified;

Exits to user supplied subroutines are permitted at several
points during the sorting process.

In addition to arguments to the sort command, other 
information is necessary to specialize the Sort for a particular 
execution. This information, called the Sort Description, can be 
supplied either through the user's terminal or in a segment. 

The description given here of the sort command is sufficient
for situations where the Sort is free standing; that is, where
no user supplied procedures are executed. (User supplied
procedures are called "exit procedures".) Additional information
is necessary for executing the sort command with exit procedures,
and is contained in the description of the sort_ subroutine in
the Multics Programmers' Manual, Subroutines, Section II.

INPUT AND OUTPUT

The user can specify the input and output files. In this
environment, the Sort reads the input files and writes the output
file. Each input or output file may be stored on any medium and
in any file organization supported by an I/O module through iox_.
The I/O module may be one of the Multics system I/O modules (such
as tape_ansi_), or one supplied by a specific installation, or
one written by a user. An input or output file is specified
either by a pathname or by an attach description.

Alternatively, the user can supply either an input_file
procedure or an output_file procedure (or both). An input_file
procedure is responsible for reading input and releasing records
to the Sort. An output_file procedure is responsible for
retrieving records (ranked by the Sort) from the Sort and writing
output.

In all cases, records may be either fixed length or variable
length.



KEY FIELDS 

The user can specify the key fields to be used in ranking
records. Key fields are described in the Keys statement of the
Sort Description. Up to 20 key fields may be specified. Any
PL/I string or numeric data type - except complex or pictured -
may be specified for a given key field. Ranking may be
ascending, descending, or mixed. For a character string field,
the collating sequence is that of the Multics standard character
set.

Alternatively, the user can specify a user supplied compare
procedure, which is then used to rank records.

The original order of records with equal keys is preserved
(FIFO order). Original input order is defined as follows:

1. If two equal records come from different input files, then
   the record from the file which is specified earlier in the
   command line is first.

2. If two equal records come from the same input file, then the
   record which is earlier in the file is first.
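
The ranking rules above can be illustrated with a short Python sketch. It is not the Sort's implementation; the record layout (tuples) and the helper name rank_records are assumptions made for illustration only. It relies on Python's sorted() being stable, so records that are equal on every key keep their original input order.

```python
from operator import itemgetter

def rank_records(records, keys):
    """Rank records by several key fields, major key first.

    records: list of tuples in original input order
             (all of the first file, then all of the second, ...).
    keys:    list of (field_index, descending) pairs, major key first.
    """
    ranked = list(records)
    # Apply the keys from the most minor to the major key. Because
    # sorted() is stable, each pass keeps the order established by the
    # more minor keys, and fully equal records stay in input order.
    for field_index, descending in reversed(keys):
        ranked = sorted(ranked, key=itemgetter(field_index), reverse=descending)
    return ranked

records = [
    ("smith", 30, "file1-rec1"),
    ("jones", 25, "file1-rec2"),
    ("smith", 30, "file2-rec1"),
    ("jones", 40, "file2-rec2"),
]
# Major key: name, ascending. Minor key: age, descending (mixed ranking).
ranked = rank_records(records, [(0, False), (1, True)])
for record in ranked:
    print(record)
```

The two "smith" records are equal on both keys, so the one from the earlier file is ranked first.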



EXITS 

The Sort provides exits to user supplied procedures at
specific points during the sorting process. Exit procedures are
named in the Exits statement of the Sort Description. The
following exit points are provided:

input_file      To obtain input records and release them one
                by one to the sorting process.

output_file     To retrieve ranked records one by one from
                the sorting process and output them.

input_record    To perform special processing for each input
                record, such as deleting, inserting, or
                altering records to be input to the Sort.

output_record   To perform special processing for each output
                record, such as deleting, inserting, or
                altering records to be output from the Sort;
                or summarizing data by accumulating it into a
                summary record.

compare         To compare two records, that is, to rank them
                for the sorting process.

sort input_specs output_spec control_args

where:

1. input_specs  indicates that the user is specifying the
                input files. Up to 10 input files may be
                specified. Each input file specification
                (each input_spec) may be supplied in one of
                the following forms:

   -input_file pathname
   -if pathname          If an input file is in the Multics
                         storage system and its file organization
                         is either sequential or indexed, then it
                         may be specified by its pathname. The
                         file may be either a single segment or a
                         multisegment file. The star convention
                         cannot be used.

                         An input file specified by a pathname
                         will be attached using the attach
                         description "vfile_ pathname".

   -input_description "attach_desc"
   -ids "attach_desc"    If an input file is not in the Multics
                         storage system or its file organization
                         is neither sequential nor indexed, then
                         it must be specified by an attach
                         description. The attach description
                         must be quoted. The target I/O module
                         specified via the attach description
                         must support the sequential_input
                         opening mode and the iox_ entry point
                         read_record.

                Pathnames and attach descriptions can be
                intermixed in the input_specs argument.

                If the user is supplying an input_file exit
                procedure, then the input_specs argument must
                be omitted and the input_file exit procedure
                must be named in the Exits statement of the
                Sort Description.

2. output_spec  indicates that the user is specifying the
                output file. Only one output file can be
                specified. The output file specification
                (output_spec) may be supplied in one of the
                following forms:

   -output_file pathname
   -of pathname          If the output file is in the Multics
                         storage system and its file organization
                         is sequential, then it may be specified
                         by its pathname. The file may be either
                         a single segment or a multisegment file.

                         The equals convention may be used. If
                         it is, it is applied to the pathname of
                         the first input file and the first input
                         file must be specified by a pathname,
                         not by an attach description.

                         An output file specified by a pathname
                         will be attached using the attach
                         description "vfile_ pathname". Thus if
                         the file does not exist, it will be
                         created. If it does exist, it will be
                         overwritten.

   -output_file -replace
   -of -rp               The output file is to replace the first
                         input file. That input file will be
                         overwritten during the merge phase of
                         the Sort. If -replace is used, the
                         first input file must be specified by a
                         pathname, not by an attach description.

   -output_description "attach_desc"
   -ods "attach_desc"    If the output file is not in the Multics
                         storage system or its file organization
                         is not sequential, then it must be
                         specified by an attach description. The
                         attach description must be quoted. The
                         target I/O module specified via the
                         attach description must support the
                         sequential_output opening mode and the
                         iox_ entry point write_record.

                If the user is supplying an output_file exit
                procedure, then the output_spec argument must
                be omitted and the output_file exit procedure
                must be named in the Exits statement of the
                Sort Description.

3. control_args must be chosen from the following:

   -console_input
   -ci                   indicates that the Sort Description is
                         read via the I/O switch user_input
                         (which normally is the user's terminal).

   -sort_desc sd_path
   -sd sd_path           indicates that the user is specifying
                         the pathname of the segment containing
                         the Sort Description.

                Either the -console_input or the -sort_desc
                argument - but not both - must be specified.
                See the heading Sort Description below.

   -temp_dir td_path
   -td td_path           indicates that the user is specifying
                         the pathname of the directory which will
                         contain the Sort's work files. The
                         equals convention cannot be used.

                         If this argument is omitted, work files
                         will be contained in the user's process
                         directory.

                         This argument should be used when the
                         process directory will not be large
                         enough to contain the work files. The
                         [wd] active function may be used for
                         td_path to place work files in the
                         user's current working directory.

   -file_size f          specifies that the total amount of data
                         to be sorted is f millions of bytes.
                         The argument f must be a decimal number.
                         If the -file_size argument is omitted,
                         the default assumption is approximately
                         one million bytes (f = 1.0).

                         This argument is intended for use when
                         some or all of the input files are not
                         in the storage system (that is, are not
                         specified by pathnames) or when an
                         input_file exit procedure is used. In
                         these cases the Sort cannot determine
                         the amount of input data. (The Sort
                         does compute the total amount of input
                         data which is in the storage system,
                         using segment bit counts.) The
                         -file_size argument may also be used
                         when all of the input files are in the
                         storage system but records are to be
                         inserted or deleted through an
                         input_record exit procedure.

                         The -file_size argument is used for
                         optimization of performance; the actual
                         amount of input data can be considerably
                         larger without preventing the Sort from
                         completing. The maximum amount of data
                         which can be sorted is (in bytes)
                         approximately 60 million times the
                         square root of f.
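
As a worked illustration of the limit just stated, the following Python sketch evaluates "60 million times the square root of the file size argument" for a few values. The function name max_sortable_bytes is hypothetical, and the result is only the approximation given in the text, not an exact capacity.

```python
import math

def max_sortable_bytes(file_size_millions):
    """Approximate upper limit stated in the text, in bytes.

    file_size_millions is the value given to -file_size
    (total data to be sorted, in millions of bytes).
    """
    if file_size_millions <= 0:
        raise ValueError("file size must be positive")
    return 60_000_000 * math.sqrt(file_size_millions)

for size in (1.0, 4.0, 25.0):
    limit = max_sortable_bytes(size)
    print(f"-file_size {size:4.1f}: about {limit / 1e6:.0f} million bytes")
```

With the default of 1.0, for example, the approximate limit is 60 million bytes, which is why an underestimate of the file size affects performance rather than preventing completion.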



NOTES 

Arguments can appear in any order, but a pathname or attach
description must immediately follow its keyword.

The temporary directory pathname (td_path) is the name of a
directory. The Sort Description pathname (sd_path) is the name
of a segment.

Any pathname may be relative (to the user's current working
directory) or absolute.
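
The argument rules above (any order, a value immediately after its keyword, at most 10 input files, one output file, and exactly one of -console_input and -sort_desc) can be sketched as a small checker in Python. This is an illustration only, not the command's actual argument processing; the function name check_sort_args and the returned dictionary layout are assumptions, and rejecting a repeated -temp_dir or -file_size is also an assumption not stated in the write-up.

```python
# Keywords that must be followed immediately by a value.
VALUE_KEYWORDS = {
    "-input_file": "inputs", "-if": "inputs",
    "-input_description": "inputs", "-ids": "inputs",
    "-output_file": "output", "-of": "output",
    "-output_description": "output", "-ods": "output",
    "-sort_desc": "sort_desc", "-sd": "sort_desc",
    "-temp_dir": "temp_dir", "-td": "temp_dir",
    "-file_size": "file_size",
}
MAX_INPUT_FILES = 10

def check_sort_args(args):
    """Group sort command arguments and enforce the documented rules."""
    spec = {"inputs": [], "output": None, "sort_desc": None,
            "temp_dir": None, "file_size": None, "console_input": False}
    i = 0
    while i < len(args):
        keyword = args[i]
        if keyword in ("-console_input", "-ci"):
            spec["console_input"] = True
            i += 1
            continue
        if keyword not in VALUE_KEYWORDS:
            raise ValueError(f"unknown argument: {keyword}")
        if i + 1 >= len(args):
            raise ValueError(f"{keyword} must be followed by a value")
        field, value = VALUE_KEYWORDS[keyword], args[i + 1]
        if field == "inputs":
            spec["inputs"].append((keyword, value))
        elif spec[field] is not None:
            # Only one output file is allowed; treating other repeats
            # as errors is an assumption of this sketch.
            raise ValueError(f"{keyword} specified more than once")
        else:
            spec[field] = (keyword, value)
        i += 2
    if len(spec["inputs"]) > MAX_INPUT_FILES:
        raise ValueError("at most 10 input files may be specified")
    if spec["console_input"] == (spec["sort_desc"] is not None):
        raise ValueError("specify exactly one of -console_input and -sort_desc")
    return spec

# Arguments may appear in any order.
spec = check_sort_args(["-ci", "-of", "b", "-if", "a"])
print(spec["inputs"], spec["output"])
```

The sketch does not model the case where input or output specifications are omitted because exit procedures are named in the Sort Description; that check would need the parsed Exits statement as well.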






Sort Description

The Sort Description contains additional information to
specialize the Sort for a particular execution. The information
supplied may be:

Keys  - Description of one or more key fields used for
        ranking records.

Exits - Specification of which exit points are to be used
        and the names of the corresponding user supplied
        exit procedures.

A Sort Description is required. As a minimum, the user must
specify how records are to be ranked, either by describing key
fields in the Keys statement or by naming a compare exit
procedure in the Exits statement. Other information in the Sort
Description is optional.

The Sort Description may be supplied as a segment or read
via the I/O switch user_input (normally the user's terminal).

If the Sort Description is supplied in a segment, its
pathname is specified in the -sort_desc argument.

If the Sort Description is read via the user's terminal,
the -console_input argument is used. The Sort prints "Input:"
via the I/O switch user_output and waits for input. The user
then types the Sort Description. To terminate the Sort
Description, the user types a line consisting of a period (".")
followed by a line feed. (This line is not part of the Sort
Description.)
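
The terminal input convention just described can be sketched in Python as follows. This only illustrates the termination rule (a line consisting of a single period ends the description and is not part of it); the function name read_sort_description is hypothetical and the lines are supplied as a list rather than read from a real terminal.

```python
def read_sort_description(lines):
    """Collect lines up to, but not including, a line consisting of "."."""
    collected = []
    for line in lines:
        text = line.rstrip("\n")
        if text == ".":
            # The terminating line is not part of the description.
            return "\n".join(collected)
        collected.append(text)
    raise ValueError('description was not terminated by a "." line')

typed = ["keys: char(10) 0;\n", ".\n", "this line is never read\n"]
description = read_sort_description(typed)
print(description)
```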



SYNTAX OF THE SORT DESCRIPTION 

A Sort Description consists of a set of statements. Each
statement must begin with a function keyword. The function
keyword is followed by the function keyword delimiter colon
(":"). The statement itself consists of one or more parameters,
separated by parameter delimiters. The parameter delimiters are
spaces, commas (","), or (in certain specific cases as specified
below) parentheses ("(" and ")"). Each statement must end with
the statement delimiter semicolon (";").

In the descriptions below, certain notational conventions
are used. A word enclosed between the less than and greater than
symbols ("<" and ">") is a notational variable, which must be
replaced by an actual word or phrase of the Sort Description
language. A word not enclosed between < and > is an actual word
of the Sort Description language. A phrase enclosed between
brackets ("[" and "]") is optional. A phrase enclosed between
braces ("{" and "}") and followed by an ellipsis ("...") is
required, and may be repeated one or more times.
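
The statement structure described above (function keyword, colon, parameters, terminating semicolon) can be sketched as a small splitter in Python. This is a minimal illustration under simplifying assumptions, not the Sort's parser: the function name split_statements is hypothetical, and the parameters are returned as raw text rather than being broken into individual parameters.

```python
def split_statements(text):
    """Split a description into (function_keyword, parameter_text) pairs."""
    stripped = text.strip()
    if stripped and not stripped.endswith(";"):
        raise ValueError("each statement must end with ';'")
    statements = []
    for chunk in stripped.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # nothing follows the final semicolon
        keyword, colon, params = chunk.partition(":")
        if not colon:
            raise ValueError(f"missing ':' after function keyword in {chunk!r}")
        # Collapse runs of spaces and line breaks between parameters.
        statements.append((keyword.strip(), " ".join(params.split())))
    return statements

description = """
keys:  fixed bin(35) 0, char(8) 1;
exits: input_file user$input,
       output_file user$output;
"""
for keyword, params in split_statements(description):
    print(keyword, "->", params)
```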

KEYS STATEMENT 

The Keys statement specifies key fields used to rank the
records of the input files. The format of the Keys statement is:

keys: {<key_description>} ... ;

The Keys statement consists of a series of one or more
<key_description>s. The key descriptions are specified in order,
the first describing the major key and the last describing the
most minor key. Up to 20 key descriptions may be supplied.

A key description is the specification of a single key
field. The format of a <key_description> is:

<datatype> (<size>) <position> [descending]

where:

1. <datatype>   is the data type of the key field. This
                element is required. See the table below for
                the encoding of <datatype>.

2. <size>       is the size of the key field, expressed in a
                form which depends on the data type. This
                element is required.

                For string data types, <size> is the length
                (characters or bits) of the field. The
                length is the exact amount of space occupied
                by the field.

                For arithmetic data types, <size> is the
                precision (binary or decimal digits) of the
                field. Scale factor, if any, must not be
                written (it is not required by the Sort).
                The space occupied is determined by the
                precision in combination with the data type
                and the alignment. (Alignment is specified
                via <position>.) For an aligned binary field
                (fixed or floating), the space occupied is
                increased if necessary to an integral number
                of words.

                <size> must be a decimal integer. The unit
                depends on the data type. See the table
                below for the semantics of <size>. (The
                rules used are the same as those used by
                Multics PL/I.)

3. <position>   is the offset of the beginning of the key
                field, relative to the beginning of the
                record. Consider the record as being aligned
                on a word boundary, as will be the case for a
                Multics PL/I structure. This element is
                required. There are two formats:

                <w>        where <w> is the word offset. Words are
                           numbered from 0 for the first word of
                           the record. This format specifies to
                           the Sort that the key field is aligned
                           on a word or (if <w> is even) on a
                           double word boundary.

                <w> (<b>)  where <w> is the word portion of the
                           offset and <b> is the bit portion of the
                           offset; that is, the bit offset within
                           the word. Bits are numbered from 0 to
                           35. This format implies that the key
                           field is not aligned on a word boundary.
                           If the key field is aligned on a word
                           boundary but the user specifies a bit
                           offset of 0 anyway, the Sort will
                           operate correctly although speed of
                           execution may be affected.

                The formats for <position> and the values for
                <w> and <b> are consistent with those shown
                in Multics PL/I listings or used by debug.

4. descending   specifies descending order for ranking using
   dsc          this key field. This element may be omitted;
                the default is ascending order for this key
                field.

DATATYPE ENCODING AND SEMANTICS OF SIZE

                    Encoding    |  Semantics of <size> (where <size> = n)
                    of          |
Data type           <datatype>  |  Unit       Range      Space Occupied
--------------------------------------------------------------------------
Character string    char        |  9 bit      1 - 4095   n characters
(Multics ASCII)                 |  character

Bit string          bit         |  1 bit      1 - 4095   n bits

Fixed binary        bin         |  1 bit      1 - 71     Aligned:
                                |                          1 <= n <= 35: one word
                                |                         36 <= n <= 71: two words
                                |                        Unaligned: n + 1 bits

Floating binary     float bin   |  1 bit      1 - 63     Aligned:
                                |                          1 <= n <= 27: one word
                                |                         28 <= n <= 63: two words
                                |                        Unaligned: n + 9 bits

Fixed decimal       dec         |  9 bit      1 - 59     n + 1 digits
(leading sign)                  |  digit

Floating decimal    float dec   |  9 bit      1 - 59     n + 2 digits
                                |  digit
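
The "Space Occupied" column of the table can be expressed as a short Python function. This is a sketch for checking key descriptions by hand, not code taken from the Sort; the function name key_field_bits is hypothetical, and treating every aligned floating binary precision above 27 as two words is an assumption of this sketch.

```python
WORD_BITS = 36    # bits per word
DIGIT_BITS = 9    # bits per character or decimal digit

def _check(n, low, high):
    if not low <= n <= high:
        raise ValueError(f"size {n} is outside the range {low}-{high}")

def key_field_bits(datatype, n, aligned=False):
    """Return the space occupied by a key field, in bits."""
    if datatype == "char":
        _check(n, 1, 4095)
        return n * DIGIT_BITS
    if datatype == "bit":
        _check(n, 1, 4095)
        return n
    if datatype == "bin":
        _check(n, 1, 71)
        if aligned:
            return WORD_BITS if n <= 35 else 2 * WORD_BITS
        return n + 1                      # unaligned: n + 1 bits
    if datatype == "float bin":
        _check(n, 1, 63)
        if aligned:
            return WORD_BITS if n <= 27 else 2 * WORD_BITS
        return n + 9                      # unaligned: n + 9 bits
    if datatype == "dec":
        _check(n, 1, 59)
        return (n + 1) * DIGIT_BITS       # n + 1 digits
    if datatype == "float dec":
        _check(n, 1, 59)
        return (n + 2) * DIGIT_BITS       # n + 2 digits
    raise ValueError(f"unknown data type: {datatype}")

print(key_field_bits("bin", 17, aligned=True))   # aligned fixed binary
print(key_field_bits("bin", 17))                 # unaligned fixed binary
print(key_field_bits("dec", 6))                  # fixed decimal
```

The alignment flag corresponds to the form of <position>: a word offset alone means aligned, and a word offset with a bit offset means unaligned.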



In addition to the forms shown for <datatype> in the table
above, the following variants are also permitted:

The following alternate spellings may be used:

     char | character     bin | binary     dec | decimal

The word "fixed" may be used (or omitted). For example:

     fixed bin | bin      fixed dec | dec

The words may be written in any sequence. For example:

     float bin | bin float

EXAMPLES OF KEY DESCRIPTIONS

char(10), 0(18)     Character string, Multics ASCII code, length
                    ten characters; starts at bit 18 of word 0.

char(8), 1, descending
                    Character string, Multics ASCII code, length
                    eight characters; starts at bit 0 of word 1;
                    ranking is descending.

character(4), 2, dsc
                    Character string, Multics ASCII code, length
                    four characters; starts at bit 0 of word 2;
                    ranking is descending.

bit(16), 0(2)       Bit string, length 16 bits; starts at bit 2
                    of word 0.

bin(17), 2          Fixed binary, precision 17; since no bit
                    offset is specified, is aligned and thus
                    occupies one word (equivalent to "bin(35),
                    2").

bin(17), 2(18)      Fixed binary, precision 17; since a bit
                    offset is specified, is unaligned and
                    occupies 18 bits; starts at bit 18 of word 2
                    (i.e., is in the low order half of word 2).

bin(1), 2(0)        Fixed binary, precision 1; unaligned and thus
                    occupies 2 bits; starts at bit 0 of word 2.

bin(1), 2           Fixed binary, precision 1; aligned and thus
                    occupies one word (equivalent to "bin(35),
                    2").

bin(36), 2          Fixed binary, precision 36; since no bit
                    offset is specified and precision is greater
                    than 35 and word offset is even, is aligned
                    and occupies two words (equivalent to
                    "bin(71), 2").

dec(6), 0(9)        Fixed decimal, 9 bit digit, precision 6;
                    starts at bit 9 of word 0 and occupies 7
                    digits including sign (that is, through the
                    end of word 1).

float dec(9), 0(9)  Floating decimal, 9 bit digit, precision 9;
                    starts at bit 9 of word 0 and occupies 11
                    digits including exponent and sign (that is,
                    through the end of word 2).
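
The key description examples above can be decoded mechanically. The following Python sketch parses one key description into its elements; it is an illustration under simplifying assumptions (lower-case input, one description per call), not the Sort's own parser, and the function name parse_key_description and the result layout are hypothetical.

```python
import re

SPELLINGS = {"character": "char", "binary": "bin", "decimal": "dec"}
DATATYPES = {
    ("char",): "char", ("bit",): "bit", ("bin",): "bin", ("dec",): "dec",
    ("bin", "float"): "float bin", ("dec", "float"): "float dec",
}
KEY_RE = re.compile(
    r"^\s*(?P<type>[a-z ]+?)\s*\(\s*(?P<size>\d+)\s*\)[\s,]*"
    r"(?P<word>\d+)\s*(?:\(\s*(?P<bit>\d+)\s*\))?[\s,]*"
    r"(?P<order>descending|dsc)?\s*$"
)

def parse_key_description(text):
    """Decode '<datatype> (<size>) <position> [descending]'."""
    match = KEY_RE.match(text)
    if match is None:
        raise ValueError(f"not a key description: {text!r}")
    words = [SPELLINGS.get(w, w) for w in match["type"].split()]
    has_fixed = "fixed" in words
    # The data type words may be written in any sequence.
    core = tuple(sorted(w for w in words if w != "fixed"))
    datatype = DATATYPES.get(core)
    if datatype is None or (has_fixed and datatype not in ("bin", "dec")):
        raise ValueError(f"unknown data type in {text!r}")
    bit = match["bit"]
    bit_offset = int(bit) if bit is not None else 0
    if not 0 <= bit_offset <= 35:
        raise ValueError("bit offset must be in the range 0-35")
    word = int(match["word"])
    return {
        "datatype": datatype,
        "size": int(match["size"]),
        "word": word,
        "bit": bit_offset,
        "offset_bits": 36 * word + bit_offset,
        "aligned": bit is None,          # no bit offset means aligned
        "descending": match["order"] is not None,
    }

for example in ("char(10), 0(18)", "character(4), 2, dsc", "bin(17), 2(18)"):
    print(example, "->", parse_key_description(example))
```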
EXITS STATEMENT

An Exits statement specifies the exit procedures to be used
during execution of the Sort. The format of an Exits statement
is:

exits: {<exit_description>} ... ;

The Exits statement consists of a set of one or more
<exit_description>s. Exit descriptions may be specified in any
order.

An exit description is the specification of one exit point
and the user supplied exit procedure to be called at that exit
point. The format of an <exit_description> is:

<exit_name> <user_name>

where:

1. <exit_name>  is the keyword naming the exit point at which
                the user supplied exit procedure is to be
                called. Exit names may be chosen from the
                following list:

                     input_file
                     output_file
                     input_record
                     output_record
                     compare

2. <user_name>  is the name of the entry point of the user
                supplied procedure. This parameter has the
                same syntax and semantics as a command name.
                That is:

                User_name can be either a segment name (e.g.,
                segment) or a segment name and an entry point
                name (e.g., segment$entry_point). In these
                cases, the user's current search rules are
                applied to find the procedure. (If some
                segment is already known by the specified
                reference name, that segment is used.)

                User_name can also be a pathname; that is,
                can specify a directory hierarchy location,
                either relative (to the user's current
                working directory) or absolute. In this
                case, the search rules are not applied and
                the pathname is used to find the procedure.
                (If some other segment is already known by
                the specified reference name, that segment is
                terminated first.)
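
The parameters of an Exits statement can be decoded with a sketch like the following Python function. It is illustrative only: the function name parse_exits and the result layout are hypothetical, rejecting an exit point that is named twice is an assumption, and recognizing a pathname by the presence of ">" or "<" is an assumption based on the usual Multics pathname syntax rather than a rule stated in this write-up.

```python
EXIT_NAMES = {"input_file", "output_file", "input_record",
              "output_record", "compare"}

def parse_exits(params):
    """Decode the parameters of an Exits statement (text after 'exits:')."""
    tokens = params.replace(",", " ").split()
    if not tokens or len(tokens) % 2:
        raise ValueError("expected one or more <exit_name> <user_name> pairs")
    exits = {}
    for exit_name, user_name in zip(tokens[0::2], tokens[1::2]):
        if exit_name not in EXIT_NAMES:
            raise ValueError(f"unknown exit name: {exit_name}")
        if exit_name in exits:
            raise ValueError(f"exit point named twice: {exit_name}")
        segment, dollar, entry_point = user_name.partition("$")
        exits[exit_name] = {
            "segment": segment,
            "entry_point": entry_point if dollar else None,
            # Assumption: ">" or "<" marks a pathname, in which case
            # the search rules are not applied.
            "is_pathname": ">" in segment or "<" in segment,
        }
    return exits

exits = parse_exits("input_file user$input, output_file user$output")
for name, target in exits.items():
    print(name, "->", target)
```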



WRITING EXIT PROCEDURES 

The exit points to be used during an execution of the Sort 
and the names of the corresponding user supplied exit procedures 
are specified in the Exits statement as described above. The 
specifications for writing exit procedures (PL/I declare and call 
statements) and the functional requirements imposed upon exit 
procedures are given in the description of the sort_ subroutine
in Section II of MPM Subroutines.

     sort -input_file sort.in -output_file =.out -console_input

     Input:

     keys: char(10), 0;

     .

In this example, the arguments of the command state that
there is one input file, whose pathname is sort.in; the output
file pathname is sort.out; the Sort Description is input via the
user's terminal; and by default the work files are contained in
the user's process directory.

The Sort Description states that there is one key, a
character string of length 10 characters, starting at word 0, bit 0
of the record. There are no exits specified.
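
The output file name in the example above is formed by applying the equals convention to the name of the first input file. A simplified Python sketch of that substitution follows. It handles only a single "=" standing for the corresponding component of the input file's entry name; the full Multics equals convention has further forms that are deliberately not modeled here, and the function name apply_equals is hypothetical.

```python
def apply_equals(source_entry, equal_name):
    """Replace each "=" component with the corresponding source component.

    source_entry: entry name of the first input file, e.g. "sort.in"
    equal_name:   output name using the equals convention, e.g. "=.out"
    """
    source = source_entry.split(".")
    result = []
    for position, component in enumerate(equal_name.split(".")):
        if component == "=":
            if position >= len(source):
                raise ValueError("no corresponding component in the source name")
            result.append(source[position])
        elif "=" in component:
            # Other forms of the convention are outside this sketch.
            raise NotImplementedError(f"unsupported component: {component}")
        else:
            result.append(component)
    return ".".join(result)

print(apply_equals("sort.in", "=.out"))
```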

     sort -temp_dir >udd>pool -sort_desc sd

In this example the arguments of the command state that the
work files are contained in the directory >udd>pool; and the
Sort Description is contained in the segment named sd.

Assume that the segment sd contains:

     keys:  fixed bin(35) 0, char(8) 1;
     exits: input_file user$input,
            output_file user$output;

The Sort Description states that there are two keys. The
major key is an aligned fixed binary field of precision 35,
contained in word 0 of the record. The minor key is a character
string of length 8, contained in words 1 and 2 of the record.

There are two exits, an input_file procedure exit and an
output_file procedure exit. The input_file exit procedure entry
point is named user$input; the output_file exit procedure entry
point is named user$output. These exits must be specified
because the command did not specify either an input file or an
output file.

     sort -if sort_in -of -replace -td [wd] -sd sort_desc

In this example the arguments of the command state that the
input file is named sort_in; the output file is to replace the
input file; work files are contained in the user's current
working directory; and the Sort Description is contained in the
segment sort_desc.

     sort -input_description "tape_ansi_ vol_1 -name a" -if b \
     -output_description "vfile_ c -extend" -ci

In this example there are two input files. The first input
file is specified by an attach description for the I/O module
tape_ansi_ with the attach argument "vol_1 -name a". The second
input file is specified by the pathname b, and thus must be a
sequential or indexed file in the storage system. The output
file is specified by an attach description for the I/O module
vfile_ with the attach argument "c -extend". For the I/O module
vfile_, this means that the pathname is c and the file is to be
extended; that is, output records from the Sort will be written
at the end of the file c (if it already exists).

(A \ followed by a line feed is used to continue the command
arguments onto the second line.)

The Sort Description (not shown) will be read via the user's
terminal.

     sort -ids "record_stream_ -target vfile_ a" -of b -ci

In this example assume that the input file is an
unstructured file in the storage system, with the pathname a.
The input file has been specified by an attach description using
the I/O module record_stream_, which will transform the record
I/O operations requested by the Sort into the appropriate stream
I/O operations for the target file a.

     sort -ids "syn_ user_switchname" -of b -ci

In this example the input file is attached using the I/O
module syn_ to the I/O switch user_switchname, which must be
attached and closed.

Name: merge

The merge command provides a generalized file merging
capability, which is specialized for execution by user supplied
parameters. The basic function of the Merge is to read one or
more input files of records which are in order according to the
values of one or more key fields, merge (collate) those records
according to the values of those key fields, and write a single
file of ordered (or "ranked") records. The Merge has the
following general capabilities:

Input and output files may be on any storage medium and in
any file organization;

Very large files, such as multisegment files, can be merged;

Multiple key fields and most PL/I string and numeric data
types may be specified;

Exits to user supplied subroutines are permitted at several
points during the merging process.

In addition to arguments to the merge command, other
information is necessary to specialize the Merge for a particular
execution. This information, called the Merge Description, can
be supplied either through the user's terminal or in a segment.

The description given here of the merge command is
sufficient for situations where the Merge is free standing; that
is, where no user supplied procedures are executed. (User
supplied procedures are called "exit procedures".) Additional
information is necessary for executing the merge command with
exit procedures, and is contained in the description of the
merge_ subroutine in the Multics Programmers' Manual,
Subroutines, Section II.



INPUT AND OUTPUT

The user specifies the input and output files. The Merge
reads the input files and writes the output file. Each input or
output file may be stored on any medium and in any file
organization supported by an I/O module through iox_. The I/O
module may be one of the Multics system I/O modules (such as
tape_ansi_), or one supplied by a specific installation, or one
written by a user. An input or output file is specified either
by a pathname or by an attach description.

In all cases, records may be either fixed length or variable
length.

KEY FIELDS

The user can specify the key fields to be used in ranking
records. Key fields are described in the Keys statement of the
Merge Description. Up to 20 key fields may be specified. Any
PL/I string or numeric data type - except complex or pictured -
may be specified for a given key field. Ranking may be
ascending, descending, or mixed. For a character string field,
the collating sequence is that of the Multics standard character
set. The records of each input file must be in order according
to those key fields.

Alternatively, the user can specify a user supplied compare
procedure, which is then used to rank records. The records of
each input file must be in order according to the algorithm of
that procedure.

The original order of records with equal keys is preserved
(FIFO order). Original input order is defined as follows:

1. If two equal records come from different input files, then
   the record from the file which is specified earlier in the
   command line is first.

2. If two equal records come from the same input file, then the
   record which is earlier in the file is first.
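
The collation and tie-breaking rules above can be illustrated with a small Python sketch of a multi-way merge. It is not the Merge's implementation; the function name merge_files is hypothetical, each input file is modeled as a list of records already in ascending key order, and descending or mixed ranking is not handled.

```python
import heapq

def merge_files(files, key):
    """Merge already-ordered record lists into one ordered list.

    Heap entries are (key value, file index, position). For equal key
    values the smaller file index wins, so the record from the file
    specified earlier comes first. Only one record per file is in the
    heap at a time, so order within a file is preserved as well.
    """
    heap = []
    for file_index, records in enumerate(files):
        if records:
            heap.append((key(records[0]), file_index, 0))
    heapq.heapify(heap)
    merged = []
    while heap:
        _, file_index, position = heapq.heappop(heap)
        records = files[file_index]
        merged.append(records[position])
        if position + 1 < len(records):
            following = records[position + 1]
            heapq.heappush(heap, (key(following), file_index, position + 1))
    return merged

first_file = [(1, "a1"), (2, "a2")]
second_file = [(1, "b1"), (3, "b2")]
merged = merge_files([first_file, second_file], key=lambda record: record[0])
print(merged)
```

Records (1, "a1") and (1, "b1") have equal keys; the one from the first file is emitted first, as rule 1 requires.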



EXITS 

The Merge provides exits to user supplied procedures at
specific points during the merging process. Exit procedures are
named in the Exits statement of the Merge Description. The
following exit points are provided:

output_record   To perform special processing for each output
                record, such as deleting, inserting, or
                altering records to be output from the Merge;
                or summarizing data by accumulating it into a
                summary record.

compare         To compare two records, that is, to rank them
                for the merging process.

merge input_specs output_spec control_args

where:

1. input_specs indicates that the user is specifying the 

input files. Up to tO input files may be 
specified. Each input file specification 
(each input_spec) may be supplied in one of 
the following formst 

   -input_file pathname
   -if pathname
      If an input file is in the Multics
      storage system and its file organization
      is either sequential or indexed, then it
      may be specified by its pathname. The
      file may be either a single segment or a
      multisegment file. The star convention
      cannot be used.

      An input file specified by a pathname
      will be attached using the attach
      description "vfile_ pathname".

   -input_description "attach_desc"
   -ids "attach_desc"
      If an input file is not in the Multics
      storage system or its file organization
      is neither sequential nor indexed, then
      it must be specified by an attach
      description. The attach description
      must be quoted. The target I/O module
      specified via the attach description
      must support the sequential_input
      opening mode and the iox_ entry point
      read_record.

   Pathnames and attach descriptions can be
   intermixed in the input_specs argument.

2. output_spec indicates that the user is specifying the
   output file. Only one output file can be
   specified. The output file specification
   (output_spec) may be supplied in one of the
   following forms:

   -output_file pathname
   -of pathname
      If the output file is in the Multics
      storage system and its file organization
      is sequential, then it may be specified
      by its pathname. The file may be either
      a single segment or a multisegment file.

      The equals convention may be used. If
      it is, it is applied to the pathname of
      the first input file and the first input
      file must be specified by a pathname,
      not by an attach description.

      An output file specified by a pathname
      will be attached using the attach
      description "vfile_ pathname". Thus if
      the file does not exist, it will be
      created. If it does exist, it will be
      overwritten.

   -output_description "attach_desc"
   -ods "attach_desc"
      If the output file is not in the Multics
      storage system or its file organization
      is not sequential, then it must be
      specified by an attach description. The
      attach description must be quoted. The
      target I/O module specified via the
      attach description must support the
      sequential_output opening mode and the
      iox_ entry point write_record.

3. control_args must be chosen from the following:

   -console_input
   -ci
      indicates that the Merge Description is
      read via the I/O switch user_input
      (which normally is the user's terminal).

   -merge_desc md_path
   -md md_path
      indicates that the user is specifying
      the pathname of the segment containing
      the Merge Description.

   Either the -console_input or the -merge_desc
   argument - but not both - must be specified.
   See the heading Merge Description below.
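
The argument rules above can be summarized as a small check. This is only a sketch in Python under stated assumptions: the function name check_merge_args is hypothetical, the arguments are taken as an already-split list of words (an attach description being one word), and the requirement of at least one input file is an assumption, since the text states only the upper limit of 10.

```python
INPUT_KEYS = {"-input_file", "-if", "-input_description", "-ids"}
OUTPUT_KEYS = {"-output_file", "-of", "-output_description", "-ods"}
DESC_KEYS = {"-merge_desc", "-md"}
CONSOLE_KEYS = {"-console_input", "-ci"}

def check_merge_args(args):
    """Return (inputs, output, description_source) or raise ValueError."""
    inputs, outputs, sources = [], [], []
    i = 0
    while i < len(args):
        word = args[i]
        if word in CONSOLE_KEYS:
            sources.append(("console", None))
            i += 1
            continue
        if word not in INPUT_KEYS | OUTPUT_KEYS | DESC_KEYS:
            raise ValueError("unrecognized argument: " + word)
        # A pathname or attach description must immediately follow its keyword.
        if i + 1 >= len(args):
            raise ValueError(word + " must be followed by a value")
        value = args[i + 1]
        if word in INPUT_KEYS:
            inputs.append((word, value))
        elif word in OUTPUT_KEYS:
            outputs.append((word, value))
        else:
            sources.append(("segment", value))
        i += 2
    # Lower bound of one input file is an assumption of this sketch.
    if not 1 <= len(inputs) <= 10:
        raise ValueError("between 1 and 10 input files are expected")
    if len(outputs) != 1:
        raise ValueError("exactly one output file must be specified")
    if len(sources) != 1:
        raise ValueError("specify -console_input or -merge_desc, but not both")
    return inputs, outputs[0], sources[0]
```

Because keywords are recognized wherever they occur, the arguments may appear in any order, as the command permits.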
NOTES

Arguments can appear in any order, but a pathname or attach
description must immediately follow its keyword.

The Merge Description pathname (md_path) is the name of a
segment.

Any pathname may be relative (to the user's current working
directory) or absolute.
The Merge Description contains additional information to
specialize the Merge for a particular execution. The information 
supplied may be: 

Keys - Description of one or more key fields used for 
ranking records. 

Exits - Specification of which exit points are to be used 
and the names of the corresponding user supplied 
exit procedures. 

A Merge Description is required. As a minimum, the user 
must specify how records are to be ranked, either by describing 
key fields in the Keys statement or by naming a compare exit 
procedure in the Exits statement. Other information in the Merge 
Description is optional. 

The Merge Description may be supplied as a segment or read 
via the I/O switch user_input (normally the user's terminal). 

If the Merge Description is supplied in a segment, its
pathname is specified in the -merge_desc argument.

If the Merge Description is read via the user's terminal,
the -console_input argument is used. The Merge prints "Input:"
via the I/O switch user_output and waits for input. The user
then types the Merge Description. To terminate the Merge
Description, the user types a line consisting of a period (".")
followed by a line feed. (This line is not part of the Merge
Description.)



SYNTAX OF THE MERGE DESCRIPTION 

A Merge Description consists of a set of statements. Each 
statement must begin with a function keyword. The function
keyword is followed by the function keyword delimiter colon
(":"). The statement itself consists of one or more parameters,
separated by parameter delimiters. The parameter delimiters are
spaces, commas (","), or (in certain specific cases as specified
below) parentheses ("(" and ")"). Each statement must end with
the statement delimiter semicolon (";").

In the descriptions below, certain notational conventions
are used. A word enclosed between the less than and greater than
symbols ("<" and ">") is a notational variable, which must be
replaced by an actual word or phrase of the Merge Description
language. A phrase enclosed between brackets ("[" and "]") is
optional. A phrase enclosed between braces ("{" and "}") and
followed by an ellipsis ("...") is required, and may be repeated
one or more times.
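
The statement syntax can be illustrated by a small splitter. This is a minimal sketch in Python, not the Merge's own parser; the function name split_statements is hypothetical, and error handling is reduced to the two rules stated above (a colon after the function keyword, a semicolon at the end of every statement).

```python
import re

def split_statements(text):
    """Split a description into (function keyword, [parameter tokens]) pairs."""
    if text.strip() and not text.rstrip().endswith(";"):
        raise ValueError("each statement must end with ';'")
    statements = []
    for chunk in text.split(";"):
        if not chunk.strip():
            continue
        keyword, colon, body = chunk.partition(":")
        if not colon:
            raise ValueError("missing ':' after function keyword")
        # Spaces and commas separate parameters; parentheses are kept
        # as tokens of their own.
        tokens = re.findall(r"[()]|[^\s,()]+", body)
        statements.append((keyword.strip(), tokens))
    return statements
```

For the text "keys: char(10), 0;" this yields the keyword "keys" and the tokens char, (, 10, ), 0.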

KEYS STATEMENT 

The Keys statement specifies key fields used to rank the
records of the input files. The format of the Keys statement is:

keys: {<key_description>} ... ;

The Keys statement consists of a series of one or more
<key_description>s. The key descriptions are specified in order,
the first describing the major key and the last describing the
most minor key. Up to 20 key descriptions may be supplied.

A key description is the specification of a single key
field. The format of a <key_description> is:

<datatype> (<size>) <position> [descending]

where:

1. <datatype> is the data type of the key field. This
   element is required. See the table below for
   the encoding of <datatype>.

2. <size> is the size of the key field. This element
   is required.

   For string data types, <size> is the length
   (characters or bits) of the field. The
   length is the exact amount of space occupied
   by the field.

   For arithmetic data types, <size> is the
   precision (binary or decimal digits) of the
   field. Scale factor, if any, must not be
   written (it is not required by the Merge).
   The space occupied is determined by the
   precision in combination with the data type
   and the alignment. (Alignment is specified
   via <position>.) For an aligned binary field
   (fixed or floating), the space occupied is
   increased if necessary to an integral number
   of words.

   <size> must be a decimal integer. The unit
   depends on the data type. See the table
   below for the semantics of <size>. (The
   rules used are the same as those used by
   Multics PL/I.)

3. <position> is the offset of the beginning of the key
   field, relative to the beginning of the
   record. Consider the record as being aligned
   on a word boundary, as will be the case for a
   Multics PL/I structure. This element is
   required. There are two formats:

   <w>
      where <w> is the word offset. Words
      are numbered from 0 for the first word
      of the record. This format specifies to
      the Merge that the key field is aligned
      on a word or (if <w> is even) on a
      double word boundary.

   <w> (<b>)
      where <w> is the word portion of the
      offset and <b> is the bit portion of the
      offset; that is, the bit offset within
      the word. Bits are numbered from 0 to
      35. This format implies that the key
      field is not aligned on a word boundary.
      If the key field is aligned on a word
      boundary but the user specifies a bit
      offset of 0 anyway, the Merge will
      operate correctly although speed of
      execution may be affected.

   The formats for <position> and the values for
   <w> and <b> are consistent with those shown
   in Multics PL/I listings or used by debug.

4. descending
   dsc   specifies descending order for ranking using
         this key field. This element may be omitted;
         the default is ascending order for this key
         field.
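
As an informal illustration, one key description can be broken into its elements as follows. This is a minimal sketch in Python and not the Merge's own parser: the function name parse_key_description and the returned dictionary are assumptions, and the sketch handles a single key description only (separating the several descriptions of a Keys statement is not shown). A bit offset of None stands for the first <position> format, that is, an aligned field.

```python
import re

KEY_RE = re.compile(
    r"""^\s*(?P<datatype>[a-z ]+?)\s*\(\s*(?P<size>\d+)\s*\)  # datatype (size)
        [\s,]*(?P<word>\d+)\s*(?:\(\s*(?P<bit>\d+)\s*\))?      # w or w(b)
        [\s,]*(?P<order>descending|dsc)?\s*$                   # optional order
    """,
    re.VERBOSE,
)

def parse_key_description(text):
    """Return the elements of one key description as a dictionary."""
    match = KEY_RE.match(text)
    if match is None:
        raise ValueError("not a key description: " + repr(text))
    bit = match.group("bit")
    if bit is not None and not 0 <= int(bit) <= 35:
        raise ValueError("bit offset must be in the range 0 to 35")
    return {
        "datatype": " ".join(match.group("datatype").split()),
        "size": int(match.group("size")),
        "word_offset": int(match.group("word")),
        "bit_offset": None if bit is None else int(bit),  # None: aligned
        "descending": match.group("order") is not None,
    }
```

For example, "char(8), 0, descending" gives datatype "char", size 8, word offset 0, no bit offset, and descending ranking.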
DATATYPE ENCODING AND SEMANTICS OF SIZE

                   Encoding      Semantics of <size> (where <size> = n)
                   of
                   <datatype>    Unit        Range      Space Occupied

Character string   char          9 bit       1 - 4095   n characters
(Multics ASCII)                  character

Bit string         bit           1 bit       1 - 4095   n bits

Fixed binary       bin           1 bit       1 - 71     Aligned:
                                                          1 <= n <= 35: one word
                                                          36 <= n <= 71: two words
                                                        Unaligned: n + 1 bits

Floating binary    float bin     1 bit       1 - 63     Aligned:
                                                          1 <= n <= 27: one word
                                                          28 <= n <= 63: two words
                                                        Unaligned: n + 9 bits

Fixed decimal      dec           9 bit       1 - 59     n + 1 digits
(leading sign)                   digit

Floating decimal   float dec     9 bit       1 - 59     n + 2 digits
                                 digit


In addition to the forms shown for <datatype> in the table
above, the following variants are also permitted:

The following alternate spellings may be used:

   char | character     bin | binary     dec | decimal

The word "fixed" may be used (or omitted). For example:

   fixed bin | bin      fixed dec | dec

The words may be written in any sequence. For example:

   float bin | bin float
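
The permitted variants can be reduced to the spellings used in the table. This is a minimal sketch in Python; the function name normalize_datatype is hypothetical, and the rejection of "fixed" or "float" with the string types is an assumption of the sketch rather than a stated rule.

```python
SPELLINGS = {
    "char": "char", "character": "char",
    "bit": "bit",
    "bin": "bin", "binary": "bin",
    "dec": "dec", "decimal": "dec",
}

def normalize_datatype(text):
    """Map any permitted variant of <datatype> to its table spelling."""
    words = text.split()
    floating = "float" in words
    if floating and "fixed" in words:
        raise ValueError("fixed and float are mutually exclusive")
    base = [SPELLINGS.get(w) for w in words if w not in ("fixed", "float")]
    if len(base) != 1 or base[0] is None:
        raise ValueError("unrecognized data type: " + repr(text))
    if base[0] in ("char", "bit"):
        if len(words) != 1:
            # Assumption: fixed/float apply to arithmetic types only.
            raise ValueError("fixed/float do not apply to string types")
        return base[0]
    # Word order does not matter; "fixed" is the default and is dropped.
    return ("float " if floating else "") + base[0]
```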






EXAMPLES OF KEY DESCRIPTIONS

char(10), 0(18)
   Character string, Multics ASCII code, length
   ten characters; starts at bit 18 of word 0.

char(8), 0, descending
   Character string, Multics ASCII code, length
   eight characters; starts at bit 0 of word 0;
   ranking is descending.

character(4), 0, dsc
   Character string, Multics ASCII code, length
   four characters; starts at bit 0 of word 0;
   ranking is descending.

bit(16), 0(2)
   Bit string, length 16 bits; starts at bit 2
   of word 0.

bin(17), 2
   Fixed binary, precision 17; since no bit
   offset is specified, is aligned and thus
   occupies one word (equivalent to "bin(35),
   2").

bin(17), 2(18)
   Fixed binary, precision 17; since a bit
   offset is specified, is unaligned and
   occupies 18 bits; starts at bit 18 of word 2
   (i.e., is in the low order half of word 2).

bin(1), 2(0)
   Fixed binary, precision 1; unaligned and thus
   occupies 2 bits; starts at bit 0 of word 2.

bin(1), 2
   Fixed binary, precision 1; aligned and thus
   occupies one word (equivalent to "bin(35),
   2").

bin(36), 2
   Fixed binary, precision 36; since no bit
   offset is specified and precision is greater
   than 35 and word offset is even, is aligned
   and occupies two words (equivalent to
   "bin(71), 2").

dec(6), 0(9)
   Fixed decimal, 9 bit digit, precision 6;
   starts at bit 9 of word 0 and occupies 7
   digits including sign (that is, through the
   end of word 1).

float dec(9), 0(9)
   Floating decimal, 9 bit digit, precision 9;
   starts at bit 9 of word 0 and occupies 11
   digits including exponent and sign (that is,
   through the end of word 2).
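
The space figures quoted in these examples can be checked with a little arithmetic. This is a minimal sketch in Python based on the Space Occupied column of the datatype table; the function names space_in_bits and last_word are hypothetical, and a bit offset of None is used to mean that no bit offset was written (an aligned field).

```python
WORD_BITS = 36

def space_in_bits(datatype, n, aligned):
    """Bits occupied by a key field with the given <datatype> and <size> n."""
    if datatype == "char":
        return 9 * n                      # n 9-bit characters
    if datatype == "bit":
        return n
    if datatype == "bin":
        if aligned:                       # rounded up to whole words
            return WORD_BITS if n <= 35 else 2 * WORD_BITS
        return n + 1                      # precision plus sign
    if datatype == "float bin":
        if aligned:                       # precision up to 27 fits one word
            return WORD_BITS if n <= 27 else 2 * WORD_BITS
        return n + 9
    if datatype == "dec":
        return 9 * (n + 1)                # n digits plus sign
    if datatype == "float dec":
        return 9 * (n + 2)                # n digits plus sign and exponent
    raise ValueError("unknown data type: " + datatype)

def last_word(datatype, n, word, bit=None):
    """Number of the last word touched by the field (words count from 0)."""
    start = word * WORD_BITS + (bit or 0)
    end = start + space_in_bits(datatype, n, aligned=bit is None) - 1
    return end // WORD_BITS
```

For "dec(6), 0(9)" the field is 63 bits long and begins at bit 9, so last_word gives 1; for "float dec(9), 0(9)" it is 99 bits long and last_word gives 2, in agreement with the examples.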
EXITS STATEMENT

An Exits statement specifies the exit procedures to be used
during execution of the Merge. The format of an Exits statement
is:

exits: {<exit_description>} ... ;

The Exits statement consists of a set of one or more
<exit_description>s. Exit descriptions may be specified in any
order.

An exit description is the specification of one exit point
and the user supplied exit procedure to be called at that exit
point. The format of an <exit_description> is:

<exit_name> <user_name>

where:

1. <exit_name> is the keyword naming the exit point at which
   the user supplied exit procedure is to be
   called. Exit names may be chosen from the
   following list:

      output_record
      compare

2. <user_name> is the name of the entry point of the user
   supplied procedure. This parameter has the
   same syntax and semantics as a command name.
   That is:

   <user_name> can be either a segment name (e.g.,
   segment) or a segment name and an entry point
   name (e.g., segment$entry_point). In these
   cases, the user's current search rules are
   applied to find the procedure. (If some
   segment is already known by the specified
   reference name, that segment is used.)

   <user_name> can also be a pathname; that is,
   can specify a directory hierarchy location,
   either relative (to the user's current
   working directory) or absolute. In this
   case, the search rules are not applied and
   the pathname is used to find the procedure.
   (If some other segment is already known by
   the specified reference name, that segment is
   terminated first.)
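
The two cases for <user_name> can be told apart mechanically. This is a minimal sketch in Python and not the system's own name resolution; the function name classify_user_name is hypothetical. It relies on two Multics conventions that are assumptions here rather than statements of this write-up: a name is treated as a pathname when it contains ">" or "<", and when no entry point name is given the entry point name is taken to be the same as the segment name.

```python
def classify_user_name(user_name):
    """Describe how a <user_name> would be interpreted (illustrative only)."""
    name, dollar, entry = user_name.partition("$")
    is_path = ">" in name or "<" in name
    # Entryname portion: text after the last ">" with any leading "<" removed.
    segment = name.rsplit(">", 1)[-1].lstrip("<")
    return {
        "kind": "pathname" if is_path else "reference name",
        "use_search_rules": not is_path,
        "entry_point": entry if dollar else segment,
    }
```

Thus "user$output" is a reference name found through the search rules, while a name such as ">udd>Proj>me>cmp" (a made-up path) bypasses them.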

WRITING EXIT PROCEDURES 

The exit points to be used during an execution of the Merge 
and the names of the corresponding user supplied exit procedures 
are specified in the Exits statement as described above. The
specifications for writing exit procedures (PL/I declare and call
statements) and the functional requirements imposed upon exit
procedures are given in the description of the merge_ subroutine
in Section II of MPM Subroutines.
merge -if merge.in_1 -if merge.in_2 -output_file =.out -ci

Input:

keys: char(10), 0;

.

In this example, the arguments of the command state that
there are two input files, whose pathnames are merge.in_1 and
merge.in_2; the output file pathname is merge.out; and the
Merge Description is input via the user's terminal.

The Merge Description states that there is one key, a
character string of length 10 characters, starting at word 0, bit 0
of the record. There are no exits specified.

merge -input_file in_1 -if in_2 -of out_1 -merge_desc md

In this example, the arguments of the command state that the
input files are named in_1 and in_2; the output file is named
out_1; and the Merge Description is contained in the segment
named md.

Assume that the segment md contains:

keys: fixed bin(35) 0, char(8) 1;
exits: output_record user$output;

The Merge Description states that there are two keys. The
major key is an aligned fixed binary field of precision 35,
contained in word 0 of the record. The minor key is a character
string of length 8, contained in words 1 and 2 of the record.

There is one exit, an output_record procedure exit; the
output_record exit procedure entry point is named user$output.
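
To make the record layout of this example concrete, the two keys can be extracted and used for ranking as follows. This is a minimal sketch in Python and not the Merge's compare routine: a record is modeled as a list of 36-bit words held as integers, fixed binary values are taken to be two's complement, and the function names are hypothetical.

```python
def fixed_bin_35(word):
    """Interpret one 36-bit word as a signed (two's complement) integer."""
    return word - (1 << 36) if word & (1 << 35) else word

def chars(words, first_word, count):
    """Unpack count 9-bit characters beginning on a word boundary."""
    out = []
    for i in range(count):
        word = words[first_word + i // 4]      # four characters per word
        shift = 27 - 9 * (i % 4)               # leftmost character first
        out.append(chr((word >> shift) & 0o777))
    return "".join(out)

def rank_key(words):
    """Major key: fixed bin(35) in word 0. Minor key: char(8) in words 1-2."""
    return (fixed_bin_35(words[0]), chars(words, 1, 8))
```

Sorting a list of such records with sorted(records, key=rank_key) ranks them ascending on both keys; Python's sort is stable, so records with equal keys keep their input order.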

merge -input_description "tape_ansi_ vol_1 -name a" -if b \
-output_description "vfile_ c -extend" -ci

In this example, there are two input files. The first input
file is specified by an attach description for the I/O module
tape_ansi_ with the attach argument "vol_1 -name a". The second
input file is specified by the pathname b, and thus must be a
sequential or indexed file in the storage system. The output
file is specified by an attach description for the I/O module
vfile_ with the attach argument "c -extend". For the I/O module
vfile_, this means that the pathname is c and the file is to be
extended; that is, output records from the Merge will be written
at the end of the file c (if it already exists).

(A \ followed by a line feed is used to continue the command
arguments onto the second line.)

The Merge Description (not shown) will be read from the
user's terminal.



merge -ids "record_stream_ -target vfile_ a" \
-ids "syn_ user_switchname" -of c -console_input

In this example, assume that the first input file is an
unstructured file in the storage system, with the pathname a.
This input file has been specified by an attach description using
the I/O module record_stream_, which will transform the record
I/O operations requested by the Merge into the appropriate stream
I/O operations for the target file a. The second input file is
attached using the I/O module syn_ to the I/O switch
user_switchname, which must be attached and closed.

Name: sort_

The sort_ subroutine provides a generalized file sorting 
capability, which is specialized for execution by user supplied 
parameters. The basic function of sort_ is to read one or more 
input files of records which are not ordered, sort those records 
according to the values of one or more key fields, and write a
single output file of ordered (or "ranked") records. The sort_
subroutine has the following general capabilities:

   Input and output files may be on any storage medium and in
   any file organization;

   Very large files, such as multisegment files, can be sorted;

   Multiple key fields and most PL/I string and numeric data
   types may be specified;

   Exits to user supplied subroutines are permitted at several
   points during the sorting process.

The arguments to the sort_ subroutine include one or more 
pointers to additional information necessary to specialize sort_ 
for execution. This additional information is called the Sort 
Description. 
INPUT AND OUTPUT

The user can specify the input and output files. In this
environment, the Sort reads the input files and writes the output
file. Each input or output file may be stored on any medium and
in any file organization supported by an I/O module through iox_.
The I/O module may be one of the Multics system I/O modules (such
as tape_ansi_), or one supplied by a specific installation, or
one written by a user. An input or output file is specified
either by a pathname or by an attach description.

Alternatively, the user can supply either an input_file
procedure or an output_file procedure (or both). An input_file
procedure is responsible for reading input and releasing records
to the Sort. An output_file procedure is responsible for
retrieving records (ranked by the Sort) from the Sort and writing
output.

In all cases, records may be either fixed length or variable
length.



KEY FIELDS 

The user can specify the key fields to be used in ranking 
records. Key fields are described in the Keys statement - or in 
the keys structure - of the Sort Description. Up to 20 key 
fields may be specified. Any PL/I string or numeric data type -
except complex or pictured - may be specified for a given key
field. Ranking may be ascending, descending, or mixed. For a
character string key field, the collating sequence is that of the
Multics standard character set.

Alternatively, the user can supply a compare procedure,
which is then used to rank records.

The original input order of records with equal keys is
preserved (FIFO order). Original input order is defined as
follows:

1. If two equal records come from different input files, then
the record from the file which is specified earlier in the
list of input files (in the input_specs subroutine argument)
is first. 

2. If two equal records come from the same input file, then the 
record which is earlier in the file is first. 
EXITS

The Sort provides exits to user supplied procedures at
specific points during the sorting process. Exit procedures are
named in the Exits statement - or in the exits and io_exits
structures - of the Sort Description. The following exit points
are provided:

input_file      To obtain input records and release them one
                by one to the sorting process.

output_file     To retrieve ranked records one by one from
                the sorting process and output them.

input_record    To perform special processing for each input
                record, such as deleting, inserting, or
                altering records to be input to the Sort.

output_record   To perform special processing for each output
                record, such as deleting, inserting, or
                altering records to be output from the Sort;
                or summarizing data by accumulating it into a
                summary record.

compare         To compare two records; that is, to rank them
                for the sorting process.



Details of exit procedures are given below under the heading 
Writing Exit Procedures. 
dcl sort_ entry ((*)char(*), char(*), (*)ptr, char(*),
                 char(*), float bin(27), fixed bin(35));

call sort_ (input_specs, output_spec, sort_desc, temp_dir,
            user_out_sw, file_size, code);

where:

1. input_specs is an array containing the specifications of
   the input files. Up to 10 input files may be
   specified. The array extent specifies the
   number of input files. (Input)

   Input file i is specified in the array
   element input_specs(i), in one of the
   following forms:

   -input_file pathname
   -if pathname
      If an input file is in the Multics
      storage system and its file organization
      is either sequential or indexed, then it
      may be specified by its pathname. The
      file may be either a single segment or a
      multisegment file. The star convention
      cannot be used.

      An input file specified by a pathname
      will be attached using the attach
      description "vfile_ pathname".

   -input_description attach_desc
   -ids attach_desc
      If an input file is not in the Multics
      storage system or its file organization
      is neither sequential nor indexed, then
      it must be specified by an attach
      description. The target I/O module
      specified via the attach description
      must support the sequential_input
      opening mode and the iox_ entry point
      read_record.

   Pathnames and attach descriptions can be
   intermixed in the input_specs array.

   If the user is supplying an input_file exit
   procedure, then input_specs(1), the first
   input file specification, must be "" (the
   array extent should be 1) and the input_file
   exit procedure must be named in the io_exits
   structure of the Sort Description.

2. output_spec is the specification of the output file.
   Only one output file may be specified.
   (Input)

   The output file may be specified in one of
   the following forms:

   -output_file pathname
   -of pathname
      If the output file is in the Multics
      storage system and its file organization
      is sequential, then it may be specified
      by its pathname. The file may be either
      a single segment or a multisegment file.

      The equals convention can be used. If
      it is, it is applied to the pathname of
      the first input file and the first input
      file must be specified by a pathname,
      not by an attach description.

      An output file specified by a pathname
      will be attached using the attach
      description "vfile_ pathname". Thus if
      the file does not exist, it will be
      created. If it does exist, it will be
      overwritten.

   -output_file -replace
   -of -rp
      The output file is to replace the first
      input file. That input file will be
      overwritten during the merge phase of
      the Sort. If -replace is used, the
      first input file must be specified by a
      pathname, not by an attach description.

   -output_description attach_desc
   -ods attach_desc
      If the output file is not in the Multics
      storage system or its file organization
      is not sequential, then it must be
      specified by an attach description. The
      target I/O module specified via the
      attach description must support the
      sequential_output opening mode and the
      iox_ entry point write_record.

   If the user is supplying an output_file exit
   procedure, then the output_spec argument must
   be "" and the output_file exit procedure must
   be named in the io_exits structure of the
   Sort Description.

3. sort_desc is an array of pointers to the Sort
   Description. See the heading Sort
   Description below. (Input)

4. temp_dir is the pathname of the directory which will
   contain the Sort's work files. (Input)

   If this argument is "", then work files will
   be contained in the user's process directory.

   This argument should be used when the process
   directory will not be large enough to contain
   the work files. The get_wdir_ function may
   be used to obtain the name of the user's
   current working directory.

5. user_out_sw specifies the destination of both the summary
   report and diagnostic messages for errors
   detected in the arguments to sort_ or in the
   Sort Description. (Input)

   This argument may have the following values:

   ""          = write the summary report and
                 diagnostic messages via the I/O
                 switch user_output.

   "-bf"       = do not write the summary report
                 and diagnostic messages. If any
                 errors are diagnosed, sort_ will
                 return with the status code
                 bad_arg but information about
                 the number and nature of the
                 errors is not available.

   switchname  = write the summary report and
                 diagnostic messages via the I/O
                 switch named switchname. The
                 switch must be attached and open
                 for stream output.

6. file_size is the total amount of data to be sorted, in
   millions of bytes. If this argument is zero,
   the default assumption is approximately one
   million bytes (file_size = 1.0). (Input)

   This argument is intended for use when some
   or all of the input files are not in the
   storage system (that is, are not specified by
   pathnames) or when an input_file exit
   procedure is used. In these cases the Sort
   cannot determine the amount of input data.
   (The Sort does compute the total amount of
   input data which is in the storage system,
   using segment bit counts.) The file_size
   argument may also be used when all of the
   input files are in the storage system but
   records are to be inserted or deleted through
   an input_record exit procedure.

   The file_size argument is used for
   optimization of performance; the actual
   amount of data can be considerably larger
   without preventing the Sort from completing.
   The maximum amount of data which can be
   sorted is (in bytes) approximately 60 million
   times the square root of file_size.
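
The capacity rule can be worked through numerically. This is a minimal sketch in Python of the approximate formula stated above and nothing more; the function name max_sortable_bytes is hypothetical, and the result is only as precise as the word "approximately" allows.

```python
import math

def max_sortable_bytes(file_size):
    """Approximate upper limit, in bytes, for a given file_size argument.

    file_size is in millions of bytes; zero selects the default of 1.0.
    """
    if file_size == 0:
        file_size = 1.0
    return 60e6 * math.sqrt(file_size)
```

With the default (file_size = 0) the limit is about 60 million bytes; stating file_size = 4.0 raises it to about 120 million bytes, and file_size = 100.0 to about 600 million bytes.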

7. code is a standard Multics status code returned by
   sort_. Possible values are listed below
   under the heading Status Codes. (Output)



NOTES 

The temporary directory pathname (temp_dir argument) is the 
name of a directory. 

Any pathname may be relative (to the user's current working 
directory) or absolute. 



STATUS CODES

The following status codes may be returned by sort_ (all
codes are in error_table_):

0                 Normal return (no errors).

bad_arg           One or more arguments specified to sort_,
                  including those in the Sort Description, was
                  invalid or inconsistent. The Sort will have
                  previously written diagnostic messages as
                  directed by the user_out_sw argument. The
                  sorting process itself has not been started.

fatal_error       The Sort has encountered a fatal error during
                  the sorting process. The Sort will have
                  previously generated a specific error message
                  and signalled the sub_error_ condition via
                  the sub_err_ subroutine.

out_of_sequence   The call to sort_ is not in the sequence
                  required by the Sort; that is, sort_ has
                  been called after initiation of the Sort but
                  before termination of that invocation.
Sort Description

The Sort Description contains additional information to
specialize the Sort for a particular execution. The Sort
Description is specified via the sort_desc argument to sort_.
The information specified may be:

Keys  - Description of one or more key fields used for
        ranking records.

Exits - Specification of which exit points are to be used
        and the names of the corresponding user supplied
        exit procedures.

A Sort Description is required. As a minimum, the user must
specify how records are to be ranked, either by describing key
fields in the Keys statement or by naming a compare exit
procedure in the Exits statement. Other information in the Sort
Description is optional.

The Sort Description may be supplied to sort_ in either of
two forms, called source form and internal form.

The source form of the Sort Description is written exactly
as specified for the sort command (see the Multics Programmers'
Manual, Commands and Active Functions, Section III), and is
stored as an ASCII segment; that is, as an unstructured file in
the Multics storage system. If source form is used, then the
sort_desc argument to sort_ must have an array extent of 1 and
the one pointer must be a pointer to the segment. (The segment
must contain only the Sort Description.) The source form is
useful when the user writes the Sort Description and supplies it
to the procedure which calls sort_.

The internal form of the Sort Description is a set of one to
three structures. The sort_desc argument must have an array
extent of 3, and the three pointers are pointers to the three
structures. Any of the structures can be omitted. In that case
the corresponding pointer must be null. The pointers must be
specified in the array in the following order:

   addr(keys)
   addr(exits)
   addr(io_exits)

where the three structures (keys, exits, and io_exits) are
defined below. The internal form is useful when the procedure
calling sort_ constructs the Sort Description.
KEYS STRUCTURE

The keys structure is used when the caller describes key
fields. The Sort's standard compare routine will then be used to
rank records. If the caller describes keys, then the compare
exit must not be specified.

If the caller does not describe keys, then the corresponding
pointer in the array sort_desc must be null and the compare exit
must be specified in the exits structure. The user supplied
compare routine will then be used to rank records.
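
The relationship between the keys structure and the compare exit amounts to an exclusive choice. This is a minimal sketch in Python of that rule only; the function name check_ranking_choice is hypothetical, and None stands in for a null pointer or an unspecified exit.

```python
def check_ranking_choice(keys, compare_exit):
    """Return which compare routine would rank records, or raise ValueError.

    keys:         the caller's key descriptions, or None if not described.
    compare_exit: the user supplied compare procedure, or None.
    """
    if keys is not None and compare_exit is not None:
        raise ValueError("keys are described: the compare exit "
                         "must not be specified")
    if keys is None and compare_exit is None:
        raise ValueError("no keys are described: the compare exit "
                         "must be specified")
    return "standard compare" if keys is not None else "user compare"
```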

The keys structure is:

dcl 1 keys,
      2 version fixed bin init(1),
      2 number fixed bin,
      2 key_desc(user_keys_number refer(keys.number)),
        3 datatype char(8),
        3 size fixed bin(24),
        3 word_offset fixed bin(18),
        3 bit_offset fixed bin(6),
        3 desc char(3);

where:

1. version is the version number of the structure (must
   be 1).

2. number is the number of key fields, established by
   the value of user_keys_number.

3. key_desc is an array of key descriptions. Each key
   description is one element of the array. The
   key descriptions must be specified in order,
   the major key first and the most minor key
   last.

4. datatype is the data type of the key field. See the
   table below for the encoding of datatype.
   The value must be left justified within
   datatype.

5. size is the size of the key field, in units which
   depend on the data type.

   For string data types, size is the exact
   length (characters or bits) of the field.

   For arithmetic data types, size is the
   precision (binary or decimal digits) of the
   field. The space occupied is determined by
   precision in combination with the data type.
   The space occupied is not adjusted for an
   aligned field. For example, for an aligned
   fixed binary field of one word, size must be
   specified as 35; for an aligned floating
   binary field of two words, size must be
   specified as 63. See the table below for the
   semantics of size.

6. word_offset is the word portion of the offset of the
   beginning of the key field, relative to the
   beginning of the record. Consider the record
   as being aligned on a word boundary, as will
   be the case for a Multics PL/I structure.
   Words are numbered from 0 for the first word
   of the record.

7. bit_offset is the bit portion of the offset of the key
   field; that is, the bit offset within the
   word in which the key field begins. Bits are
   numbered from 0 to 35. (If the field is
   aligned on a word boundary, then bit_offset
   is 0.)

8. desc indicates whether ranking for this key field
   is to be ascending or descending. Possible
   values are:

   ""    = use ascending ranking.

   "dsc" = use descending ranking.
DATATYPE ENCODING AND SEMANTICS OF SIZE

                   Encoding     Semantics of size (where size = n)
                   of
                   datatype     Unit        Range      Space Occupied

Character string   char         9 bit       1 - 4095   n characters
(Multics ASCII)                 character

Bit string         bit          1 bit       1 - 4095   n bits

Fixed binary       bin          1 bit       1 - 71     n + 1 bits

Floating binary    flbin        1 bit       1 - 63     n + 9 bits

Fixed decimal      dec          9 bit       1 - 59     n + 1 digits
(leading sign)                  digit

Floating decimal   fldec        9 bit       1 - 59     n + 2 digits
                                digit
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EXITS STRUCTURE 



The exits structure is:

dcl 1 exits,
      2 version fixed bin init(1),
      2 compare entry,
      2 input_record entry,
      2 output_record entry;

where:

1. version is the version number of the structure (must
be 1).

2. compare specifies the entry point of a user supplied
compare exit procedure. If the caller
describes key fields (supplies a keys
structure), then this exit must not be
specified.

3. input_record specifies the entry point of a user supplied
input_record exit procedure. This exit can
be specified whether or not the input_file
exit is specified.

4. output_record specifies the entry point of a user supplied
output_record exit procedure. This exit can
be specified whether or not the output_file
exit is specified.



IO_EXITS STRUCTURE 

The io_exits structure is:

dcl 1 io_exits,
      2 version fixed bin init(1),
      2 input_file entry,
      2 output_file entry;

where:

1. version is the version number of the structure (must
be 1).

2. input_file specifies the entry point of a user supplied
input_file exit procedure. If the caller
names input files, then this exit must not be
specified.

3. output_file specifies the entry point of a user supplied
output_file exit procedure. If the caller
names the output file, then this exit must
not be specified.

ENTRY VARIABLES 

In the exits and io_exits structures, each exit point is
specified via an entry variable. The entry variable must be set
(either initialized or assigned) by a user procedure, normally
the procedure which calls sort_. The entry variable can identify
either an internal entry point (that is, an internal procedure)
or an external entry point of the procedure which sets the entry
variable, or it can identify an external entry point of another
procedure.

If none of the exits declared in either the exits or
io_exits structure is to be used, then that structure can be
omitted and the corresponding pointer in the array sort_desc must
be null. If the structure is included but an exit specified in
it is not to be used, then the corresponding entry variable must
be set to sort_$noexit, which is declared:

dcl sort_$noexit entry external;

An exit point may not be altered after the call to sort_.
Any change to the entry variable thereafter will have no effect.
However, certain entry points can be disabled, as specified in
the descriptions of the individual exit procedures below.
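
As an illustration of the rules above, the following sketch fills
in an exits structure so that only the compare exit is used. The
names setup_exits and my_compare are hypothetical; the structure
and sort_$noexit are as declared in this write-up. The call to
sort_ itself is omitted.

```pli
setup_exits: proc;

dcl 1 exits,
      2 version fixed bin init(1),
      2 compare entry,
      2 input_record entry,
      2 output_record entry;

dcl sort_$noexit entry external;

/* hypothetical user supplied compare procedure */
dcl my_compare entry (ptr, ptr) returns (fixed bin(1));

     exits.compare = my_compare;           /* this exit is used */
     exits.input_record = sort_$noexit;    /* this exit is not used */
     exits.output_record = sort_$noexit;   /* this exit is not used */

     /* A pointer to exits would now be placed in the sort_desc
        array before the call to sort_ (not shown here). */

end setup_exits;
```

Every entry variable in an included structure is set, either to a
user procedure or to sort_$noexit, so that no member is left
undefined when sort_ is called.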
A user supplied exit procedure is called by the Sort to
perform a specified function. The user procedure must perform
that function, and then must return to the Sort. The user exit
procedure may perform additional functions desired by the user.

Certain exit procedures replace the corresponding standard
routine of the Sort. Other exit procedures supplement the normal
functions of the Sort. This is specified for each individual
exit procedure below.

The following exit points are provided:

input_file
output_file
compare
input_record
output_record

All exit points may be active during the same invocation of
the Sort.

The entry point names of all user supplied exit procedures
are defined by the user. Specific names are shown below only for
convenience in discussion.

INPUT_FILE EXIT PROCEDURE

An input_file exit procedure replaces the standard input
reading function of the Sort. The Sort calls the input_file exit
procedure only once during an execution of the Sort.

An input_file exit procedure must perform the following
function: For each record which is input by the user to the
sorting process, the input_file exit procedure must make one call
to the entry sort_$release (described later). After the
input_file exit procedure has released the last input record to
the Sort, it must return to the Sort.

Usage

input_file: proc(code);

dcl code fixed bin(35) parameter;

where code is a standard Multics status code (in error_table_)
which must be returned by the input_file exit procedure. If the
value is not 0, then the Sort normally prints the corresponding
message and returns to its caller with the status code
fatal_error. (Output)
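
A minimal sketch of an input_file exit procedure follows. It
releases three fixed length records held in an internal array;
the array names and its contents are illustrative assumptions.
The sketch stops at the first nonzero status code, which is only
one possible policy.

```pli
input_file: proc(code);

dcl code fixed bin(35) parameter;

dcl sort_$release entry(ptr, fixed bin(21), fixed bin(35));
dcl addr builtin;

/* illustrative data: three 8 character records */
dcl names (3) char(8) static init("delta", "alpha", "charlie");
dcl rec_len fixed bin(21);
dcl i fixed bin;

     code = 0;
     rec_len = 8;                          /* record length in bytes */
     do i = 1 to 3;
          /* one call to sort_$release per input record */
          call sort_$release (addr (names (i)), rec_len, code);
          if code ^= 0 then return;        /* pass the status code back */
     end;

end input_file;
```

A procedure which prefers to skip records rejected as too long or
too short could test for those particular status codes and
continue the loop instead of returning.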
OUTPUT_FILE EXIT PROCEDURE

An output_file exit procedure replaces the standard output
writing function of the Sort. The Sort calls the output_file
exit procedure only once during an execution of the Sort.

An output_file exit procedure must perform the following
function: For each record which is to be retrieved in ranked
order from the Sort, the output_file exit procedure must make one
call to the entry point sort_$return (described later). If
sort_$return is called but there are no more records to be
retrieved from the sorting process, then sort_$return returns
with the status code end_of_info. The output_file exit procedure
then must return to the Sort. If the user desires, the
output_file exit procedure may terminate retrieval at any time
prior to receiving the end_of_info status, but it must still
return to the Sort. (The entry sort_$return may return status
codes other than end_of_info in case of error.)

Usage

output_file: proc(code);

dcl code fixed bin(35) parameter;

where code is a standard Multics status code (in error_table_)
which must be returned by the output_file procedure. If the
value is not 0, then the Sort normally prints the corresponding
message and returns to its caller with the status code
fatal_error. (Output)
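
A minimal sketch of an output_file exit procedure follows. It
retrieves records until sort_$return reports end_of_info and
merely counts them; the variable count and the choice to stop at
any nonzero status code are assumptions made for illustration.

```pli
output_file: proc(code);

dcl code fixed bin(35) parameter;

dcl sort_$return entry(ptr, fixed bin(21), fixed bin(35));
dcl error_table_$end_of_info fixed bin(35) external;

dcl buff_ptr ptr;
dcl rec_len fixed bin(21);
dcl rec char(rec_len) based(buff_ptr);     /* the retrieved record */
dcl count fixed bin(35) init(0);

     call sort_$return (buff_ptr, rec_len, code);
     do while (code = 0);
          count = count + 1;               /* rec could be processed here */
          call sort_$return (buff_ptr, rec_len, code);
     end;

     /* end_of_info is the normal end of data, not an error */
     if code = error_table_$end_of_info then code = 0;

end output_file;
```

Any status code other than end_of_info is returned unchanged to
the Sort, which then treats it as described above.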
COMPARE EXIT PROCEDURE

A compare exit procedure replaces the standard record
comparison procedure of the Sort. The Sort calls the compare
exit procedure each time the sorting process is ready to rank two
records; that is, to determine which of the two is first in the
sorted order.

A compare exit procedure must perform the following
function: The compare exit procedure receives as arguments a
pointer to each of the two records. The compare exit procedure
must determine which of the two records is first - or that they
are equal in rank - and must return a corresponding return value
to the Sort. The compare exit procedure is invoked as a
function.

Usage

compare: proc(rec_ptr_1, rec_ptr_2) returns(fixed bin(1));

dcl (rec_ptr_1 ptr,
     rec_ptr_2 ptr) parameter;
dcl result fixed bin(1);

return(result);

end compare;

where:

1. rec_ptr_1 is a pointer to a double word aligned buffer
containing the first record of the pair to be
compared. This record is always the first of
the two according to the original input
order. (Input)

2. rec_ptr_2 is a pointer to a double word aligned buffer
containing the second record of the pair to be
compared. (Input)

3. result is the result of the comparison. (Output)

Possible values are:

0 = the two records rank equal.

-1 = the record pointed to by rec_ptr_1 ranks
first.

+1 = the record pointed to by rec_ptr_2 ranks
first.

If a compare exit procedure requires the length of either
record, it is available in the word preceding that record in the
form:

dcl rec_len fixed bin(21) aligned;

A compare exit procedure cannot alter either the content or
the length of either record.
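
A minimal sketch of a compare exit procedure follows. It assumes,
purely for illustration, that every record begins with an 8
character key and ranks records in ascending order of that key.

```pli
compare: proc(rec_ptr_1, rec_ptr_2) returns(fixed bin(1));

dcl (rec_ptr_1 ptr,
     rec_ptr_2 ptr) parameter;

/* assumed layout: the key is the first 8 characters of each record */
dcl key_1 char(8) based(rec_ptr_1);
dcl key_2 char(8) based(rec_ptr_2);

     if key_1 < key_2 then return(-1);     /* first record ranks first */
     if key_1 > key_2 then return(1);      /* second record ranks first */
     return(0);                            /* the two records rank equal */

end compare;
```

The procedure only reads the two records, in keeping with the
rule that a compare exit procedure cannot alter them.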
INPUT_RECORD EXIT PROCEDURE

An input_record exit procedure may be used whether the
Sort's standard input_file procedure or a user supplied
input_file exit procedure is used, and supplements that
input_file process. The Sort calls the input_record exit
procedure:

1. Each time the input_file process releases a record to the
Sort, and before that record is entered into the sorting
process;

2. Once more after the last input record has been released to
the Sort (end of input);

3. Additionally, each time the input_record exit procedure
returns with an action of insert.

The Sort gives the input_record exit procedure access to the
current record, the record about to be entered into the sorting
process.

An input_record exit procedure need not perform any
processing. If it does not, then the Sort will accept the
current record into the sorting process.

An input_record exit procedure may perform the following
functions, which are accomplished via the values of arguments
returned when the input_record exit procedure returns to the
Sort:

Accept the current record. This is accomplished by setting
action = 0.

Delete the current record. This is accomplished by setting
action = 1.

Insert one or more records before the current record. (At
the last call to the input_record exit procedure, records
may be inserted at the end of input.) This is accomplished
by setting rec_ptr to point to the record to be inserted,
setting rec_len appropriately, and setting action = 3.

Alter the current record, before it is entered into the
sorting process. This is accomplished by altering the
record pointed to by rec_ptr or setting rec_ptr to point to
another record, setting rec_len appropriately, and setting
action = 0.



Close the exit point so that the input_record exit procedure
will not be called again during this execution of the Sort.
This is accomplished by setting close_exit_sw = "1".

The input_record exit procedure must return to the Sort each
time it is called.

Usage

input_record: proc(rec_ptr, rec_len, action, close_exit_sw);

dcl (rec_ptr ptr,
     rec_len fixed bin(21),
     action fixed bin,
     close_exit_sw bit(1)) parameter;

where:

1. rec_ptr points to a double word aligned buffer
containing the current record. The
input_record exit procedure may alter the
contents of the record or may change the
pointer to point to another record. For the
actions of accept and insert, the Sort will
use the value of rec_ptr returned to it by
the input_record exit procedure.
(Input/Output)

At the last call to the input_record exit
procedure (end of input), there is no current
record and rec_ptr = null().

2. rec_len is the length of the current record in bytes.
The input_record exit procedure may change
the length of the record. For the actions of
accept and insert, the Sort will use the
value of rec_len returned to it by the
input_record exit procedure. (Input/Output)

3. action indicates the action to be taken upon return
to the Sort. (Input/Output)

Arguments referred to below are the values
returned to the Sort by the input_record exit
procedure.

Possible values of action are:

0 = accept the current record. The record
pointed to by rec_ptr, whose length is
given by rec_len, is entered into the
sorting process.

Each time the input_record exit procedure
is called, the Sort sets action to this
value.

1 = delete the current record. The current
record is not entered into the sorting
process.

3 = insert a record. The record pointed to
by rec_ptr, whose length is given by
rec_len, is entered into the sorting
process. The Sort calls the input_record
exit procedure again, so that the current
record may be accepted or deleted or an
additional record may be inserted. At
this next call to the input_record exit
procedure, the current record remains the
same.

At the last call to the input_record exit
procedure (end of input), if the input_record
exit procedure inserts records then they are
appended at the end of input. Any other
value for action means do not append any
records, and the input_record exit will not
be taken again.

4. close_exit_sw indicates whether the exit is to be closed
hereafter. (Input/Output)

Possible values are:

"0" = keep this exit open. Each time the
input_record procedure is called, the
Sort sets close_exit_sw to this value.

"1" = close this exit. The Sort will not
call the input_record exit procedure
again during this execution of the Sort
(even if the action is insert).
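
A minimal sketch of an input_record exit procedure follows. It
deletes records which are entirely blank and accepts all others;
the blank record test is an assumption chosen for illustration.

```pli
input_record: proc(rec_ptr, rec_len, action, close_exit_sw);

dcl (rec_ptr ptr,
     rec_len fixed bin(21),
     action fixed bin,
     close_exit_sw bit(1)) parameter;

dcl null builtin;
dcl rec char(rec_len) based(rec_ptr);      /* the current record */

     /* last call (end of input): no current record, nothing to insert */
     if rec_ptr = null () then return;

     /* action was set to 0 (accept) by the Sort before this call */
     if rec = "" then action = 1;          /* delete an all blank record */

end input_record;
```

Because the Sort sets action to 0 before each call, the procedure
changes it only when a record is to be deleted.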
OUTPUT_RECORD EXIT PROCEDURE

An output_record exit procedure may be used whether the
Sort's standard output_file procedure or a user supplied
output_file exit procedure is used, and supplements that
output_file process. The Sort calls the output_record exit
procedure:

1. Each time it has determined the next record in ranked order
from the merging process;

2. Once more after the last record has been obtained from the
merging process (end of output);

3. Additionally, each time the output_record exit procedure
returns with an action of insert.

The Sort gives the output_record exit procedure access to
two records:

1. The output record, about to be written to the output file.
(If an output_file exit procedure has been specified by the
user, this is the record about to be returned to that exit
procedure.)

2. The next record, the record leaving the merging process.

An output_record exit procedure need not perform any
processing. If it does not, then the output record is accepted
for the output file.

An output_record exit procedure may perform the following
functions, which are accomplished via the values of arguments
returned when the output_record exit procedure returns to the
Sort:

Accept the output record. This is accomplished by setting
action = 0.

Delete the output record. This is accomplished by setting
action = 1.

Delete the record leaving the merge. This is accomplished
by setting action = 2.

Insert one or more records after the output record. (At the
first call to the output_record exit procedure, records may
be inserted at the beginning of output. At the last call to
the output_record exit procedure, records may be inserted at
the end of output.) This is accomplished by setting
rec_ptr_2 to point to the record to be inserted, setting
rec_len_2 appropriately, and setting action = 3.

Alter the output record, before it is written to the output
file. This is accomplished by altering the record pointed
to by rec_ptr_1 or setting rec_ptr_1 to point to another
record, setting rec_len_1 appropriately, and setting action
= 0 to accept (or action = 3 to insert).

Summarize data into the first record of a sequence of
records with equal keys, and delete the succeeding records
of the sequence. This may be accomplished as follows: At
the first call to the output_record exit procedure, set
equal key checking on (equal_key_sw = "1"). At subsequent
calls to the output_record exit procedure, if the output
record and the record leaving the merge have equal keys
(equal_key = 0), then accumulate data into the output record
and delete the record leaving the merge (action = 2). If
the two records have unequal keys (equal_key ^= 0), then
accept the output record (action = 0).

Summarize data into the last record of a sequence with equal
keys, and delete the preceding records of the sequence.
This may be accomplished as follows: At the first call to
the output_record exit procedure, set equal key checking on.
At subsequent calls, if the two records have equal keys then
accumulate data into a work area and delete the output
record (action = 1). If the two records have unequal keys,
then alter the output record using the accumulated data and
accept that record (action = 0).

Sequence check the output file. This is accomplished by
setting seq_check_sw = "1". If the output record will not
collate properly with the output file, or does not have its
keys in the position specified to the Sort, then set
seq_check_sw = "0".

Close the exit point so that the output_record exit
procedure will not be called again during this execution of
the Sort. This is accomplished by setting close_exit_sw =
"1".

Usage



output_record: proc(rec_ptr_1, rec_len_1, rec_ptr_2, rec_len_2,
          action, equal_key, equal_key_sw,
          seq_check_sw, close_exit_sw);

dcl (rec_ptr_1 ptr,
     rec_len_1 fixed bin(21),
     rec_ptr_2 ptr,
     rec_len_2 fixed bin(21),
     action fixed bin,
     equal_key fixed bin(1),
     equal_key_sw bit(1),
     seq_check_sw bit(1),
     close_exit_sw bit(1)) parameter;

where:

1. rec_ptr_1 points to a double word aligned buffer
containing the output record. The
output_record exit procedure may alter the
contents of this record or may change the
pointer to point to another record. The Sort
uses the value of rec_ptr_1 returned to it by
the output_record exit procedure as specified
below in the description of the action
argument. (Input/Output)

At the first call to the output_record exit
procedure (beginning of output), there is no
output record and rec_ptr_1 = null().

2. rec_len_1 is the length of the output record in bytes.
The output_record exit procedure may change
the length of this record. The Sort uses the
value of rec_len_1 returned to it by the
output_record exit procedure as specified
below in the description of the action
argument. (Input/Output)

3. rec_ptr_2 points to a double word aligned buffer
containing the record leaving the merge. The
output_record exit procedure may not alter
the contents of this record. For all actions
except insert, the Sort will ignore the value
of rec_ptr_2 returned to it by the
output_record exit procedure. If the action
is insert, then the output_record exit
procedure must change rec_ptr_2 to point to
the record to be inserted. (Input/Output)

At the last call to the output_record exit
procedure (end of output), there is no record
leaving the merge and rec_ptr_2 = null().

4. rec_len_2 is the length of the record leaving the
merge. The output_record exit procedure may
not change the length of this record. For
all actions except insert, the Sort will
ignore the value of rec_len_2 returned to it
by the output_record exit procedure. If the
action is insert, then the output_record exit
procedure must set rec_len_2 to the length of
the record to be inserted. (Input/Output)

5. action indicates the action to be taken upon return
to the Sort. (Input/Output)

Possible values of action are:

0 = accept the output record. The output
record is written to the output file.
The Sort uses the returned values of
rec_ptr_1 and rec_len_1 to identify the
record to be written. At the next call
to the output_record exit procedure, the
record leaving the merge becomes the new
output record, and a new record leaving
the merge has been obtained.

Each time the output_record exit
procedure is called, the Sort sets action
to this value.

1 = delete the output record. No record is
written to the output file. The Sort
ignores the returned values of rec_ptr_1
and rec_len_1. At the next call to the
output_record exit procedure, the record
leaving the merge becomes the new output
record, and a new record leaving the
merge has been obtained.

2 = delete the record leaving the merge.
(This action should be used for
summarization into the output record.)
No record is written to the output file.
At the next call to the output_record
exit procedure, the output record remains
the same, and a new record leaving the
merge has been obtained. The Sort uses
the returned values of rec_ptr_1 and
rec_len_1 to identify the output record
for that next call to the output_record
exit procedure.

3 = insert a record after the output record.
The output record is written to the
output file. The Sort uses the returned
values of rec_ptr_1 and rec_len_1 to
identify the record to be written. The
Sort calls the output_record exit
procedure again, so that the inserted
record may be accepted or an additional
record may be inserted. At this next
call to the output_record exit procedure,
the inserted record becomes the new
output record, and the record leaving the
merge remains the same. The Sort uses
the returned values of rec_ptr_2 and
rec_len_2 to identify the inserted
record.

At the last call to the output_record exit
procedure (end of output), if the
output_record exit procedure inserts records
then they are appended at the end of output.
Any other value for action means do not
append any records, and the output_record
exit will not be taken again.

6. equal_key indicates whether the output record and the
record leaving the merge have equal keys.
(Input)

Possible values are:

0 = the two records rank equal.

1 = the two records do not rank equal. At
the first and last calls to the
output_record exit procedure (beginning
of output and end of output), only one
record is present and the Sort sets
equal_key to this value.

If the user supplied key descriptions, then
the value of equal_key is determined only by
those key fields; the original input order
of the two records is not used to resolve key
equality. If the user supplied a compare
exit procedure, then the Sort uses the result
of that compare exit procedure to set the
value of equal_key. (In either case, if the
two records rank equal then rec_ptr_1 points
to the record which is first according to the
original input order of the two records.)

7. equal_key_sw indicates whether or not equal key checking
is to be performed. (Input/Output)

Possible values are:

"0" = do not check for equal keys. At the
first call to the output_record exit
procedure (beginning of output), the
Sort sets equal_key_sw to this value.

"1" = check for equal keys before the next
call to the output_record exit
procedure.

Since equal key checking takes time, the user
should set equal_key_sw = "1" only when
required for actions such as summarization.

8. seq_check_sw indicates whether or not sequence checking is
to be performed. (Input/Output)

Possible values are:

"0" = do not sequence check.

"1" = sequence check. At the first call to
the output_record exit procedure
(beginning of output), the Sort sets
seq_check_sw to this value.

Sequence checking means comparing the output
record to the record previously written to
the output file. (If the user specified an
output_file exit procedure, the output record
is compared to the record previously returned
to that exit procedure.) Sequence checking
is performed after the output_record exit
procedure returns to the Sort, and only if a
record is to be written to the output file
(that is, only if the action is accept or
insert). If the user supplied key
descriptions, then the Sort's key comparison
routine is used. If the user supplied a
compare exit procedure, then that exit
procedure is called.

If the output record is out of sequence with
the previous record, then the status code
fatal_error is returned to the caller of
sort_; see the entry sort_ above. (If the
user specified an output_file exit procedure,
then the status code data_seq_error is
returned to that exit procedure; see the
entry sort_$return below.)

All records written to the output file,
including inserted records, can be sequence
checked.

9. close_exit_sw indicates whether the exit is to be closed
hereafter. (Input/Output)

Possible values are:

"0" = keep this exit open. Each time the
output_record exit procedure is called,
the Sort sets close_exit_sw to this
value.

"1" = close this exit. The Sort will not
call the output_record exit procedure
again during this execution of the Sort
(even if the action is insert).
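
The summarization technique described above (accumulating into
the first record of a sequence with equal keys) might be sketched
as follows. The record layout, an 8 character key followed by a
one word amount, is a hypothetical assumption, as is the choice
to set equal_key_sw at every call rather than only at the first.

```pli
output_record: proc(rec_ptr_1, rec_len_1, rec_ptr_2, rec_len_2,
          action, equal_key, equal_key_sw,
          seq_check_sw, close_exit_sw);

dcl (rec_ptr_1 ptr,
     rec_len_1 fixed bin(21),
     rec_ptr_2 ptr,
     rec_len_2 fixed bin(21),
     action fixed bin,
     equal_key fixed bin(1),
     equal_key_sw bit(1),
     seq_check_sw bit(1),
     close_exit_sw bit(1)) parameter;

dcl null builtin;

/* hypothetical record layout, applied to both records */
dcl 1 out_rec aligned based(rec_ptr_1),
      2 key char(8),
      2 amount fixed bin(35);
dcl 1 next_rec aligned based(rec_ptr_2),
      2 key char(8),
      2 amount fixed bin(35);

     equal_key_sw = "1"b;                  /* request equal key checking */

     /* first or last call: only one record is present; accept */
     if rec_ptr_1 = null () | rec_ptr_2 = null () then return;

     if equal_key = 0 then do;             /* the two records rank equal */
          out_rec.amount = out_rec.amount + next_rec.amount;
          action = 2;                      /* delete record leaving the merge */
     end;
     /* otherwise action remains 0: accept the output record */

end output_record;
```

With action = 2 the output record stays the same at the next
call, so successive records with equal keys are folded into it
until a record with a different key arrives.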
RECORD POINTERS

Since the Sort aligns each record in a buffer on a double
word boundary, if an exit procedure applies a based declaration
of the record to the pointer(s) then correct alignment is
ensured.

ORIGINAL INPUT ORDER (FIFO)

For the compare and output_record exit procedures, rec_ptr_1
always points to the record whose original input order was prior
to the record pointed to by rec_ptr_2. If a compare exit
procedure returns with an equal ranking for the two records, then
this original input order is preserved. Original input order has
been defined earlier under the heading Key Fields.

Entry: sort_$release

The entry sort_$release is used each time the caller
releases a record to the sorting process. Calls to
sort_$release are made from a user supplied input_file procedure.
The caller specifies the location and length of the record. The
Sort accepts the record and stores it in its own work area.

Usage

dcl sort_$release entry(ptr, fixed bin(21), fixed bin(35));
call sort_$release (buff_ptr, rec_len, code);

where:

1. buff_ptr is a pointer to a byte aligned buffer
containing the record. (Input)

2. rec_len is the length of the record in bytes.
(Input)

3. code is a standard Multics status code returned by
the Sort. Possible values are listed below
under the heading Status Codes. (Output)

The Sort aligns each record on a double word boundary in a
work area.

Status Codes

The following status codes may be returned by the
sort_$release entry point (all codes are in error_table_):

0 Normal return (no error).

out_of_sequence The call to sort_$release is not in the
sequence required by the Sort, e.g.,
sort_$release has been called before sort_.

fatal_error The Sort has encountered a fatal error during
the sorting process. The Sort will have
previously generated a specific error message
and signalled the sub_error_ condition via
the sub_err_ subroutine.

long_record This input record is longer than the maximum
supported. The record is ignored by the
Sort, and the caller may continue to release
records to the Sort.

short_record This input record is shorter than the minimum
required to contain the key fields. The
record is ignored by the Sort, and the caller
may continue to release records to the Sort.

Entry: sort_$return

The sort_$return entry is used each time the caller
retrieves a record, in ranked order, from the Sort. Calls to
sort_$return are made from a user supplied output_file procedure.
Upon return from sort_$return, the caller is given the location
and length of the record.

If sort_$return is called but there are no more records to
be retrieved, then sort_$return returns to the caller with the
status code end_of_info.

Usage

dcl sort_$return entry(ptr, fixed bin(21), fixed bin(35));
call sort_$return (buff_ptr, rec_len, code);

where:

1. buff_ptr is a pointer to a double word aligned buffer
containing the record. (Output)

2. rec_len is the length of the record in bytes.
(Output)

3. code is a standard Multics status code returned by
the Sort. Possible values are listed below
under the heading Status Codes. (Output)

The Sort aligns each record on a double word boundary in a
work area. Thus if the caller applies a based declaration of the
record to the pointer then correct alignment is ensured.

Status Codes

The following status codes may be returned by the
sort_$return entry point (all codes are in error_table_):

0 Normal return (not end of information, no
error).

end_of_info There are no more records to be retrieved
from the Sort. This is the normal end of
data indication. No record is returned to
the caller.

out_of_sequence The call to sort_$return is not in the
sequence required by the Sort, e.g.,
sort_$return has been called before
sort_$release.

fatal_error The Sort has encountered a fatal error during
the sorting process. The Sort will have
previously generated a specific error message
and signalled the sub_error_ condition via
the sub_err_ subroutine.

data_loss End of data has been reached, but the number
of records previously returned is less than
the number of records released to the Sort.
No record is returned to the caller.

data_gain The number of records returned (including
this record) is now larger than the number of
records released to the Sort. The current
record is returned to the caller, and the
caller may continue to retrieve records from
the Sort.

data_seq_error A ranking error has occurred in the records
returned to the caller (as determined by the
key fields of the record). The current
record is returned to the caller, and the
caller may continue to request records from
the Sort.

Name: merge_

The merge_ subroutine provides a generalized file merging
capability, which is specialized for execution by user supplied
parameters. The basic function of merge_ is to read one or more
input files of records which are in order according to the values
of one or more key fields, merge (collate) those records
according to the values of those key fields, and write a single
output file of ordered (or "ranked") records. The merge_
subroutine has the following general capabilities:

Input and output files may be on any storage medium and in
any file organization;

Very large files, such as multisegment files, can be merged;

Multiple key fields and most PL/I string and numeric data
types may be specified;

Exits to user supplied subroutines are permitted at several
points during the merging process.

The arguments to the merge_ subroutine include one or more
pointers to additional information necessary to specialize merge_
for execution. This additional information is called the Merge
Description.



INPUT AND OUTPUT

The user specifies the input and output files. The Merge
reads the input files and writes the output file. Each input or
output file may be stored on any medium and in any file
organization supported by an I/O module through iox_. The I/O
module may be one of the Multics system I/O modules (such as
tape_ansi_), or one supplied by a specific installation, or one
written by a user. An input or output file is specified either
by a pathname or by an attach description.

In all cases, records may be either fixed length or variable
length.

KEY FIELDS

The user can specify the key fields to be used in ranking
records. Key fields are described in the Keys statement - or in
the keys structure - of the Merge Description. Up to 20 key
fields may be specified. Any PL/I string or numeric data type -
except complex or pictured - may be specified for a given key
field. Ranking may be ascending, descending, or mixed. For a
character string key field, the collating sequence is that of the
Multics standard character set. The records of each input file
must be in order according to those key fields.

Alternatively, the user can supply a compare procedure,
which is then used to rank records. The records of each input
file must be in order according to the algorithm of that
procedure.

The original input order of records with equal keys is
preserved (FIFO order). Original input order is defined as
follows:

1. If two equal records come from different input files, then
the record from the file which is specified earlier in the
list of input files (in the input_specs subroutine argument)
is first.

2. If two equal records come from the same input file, then the
record which is earlier in the file is first.



EXITS 

The Merge provides exits to user supplied procedures at
specific points during the merging process. Exit procedures are
named in the Exits statement - or in the exits structure - of the
Merge Description. The following exit points are provided:

output_record  To perform special processing for each output
               record, such as deleting, inserting, or
               altering records to be output from the Merge,
               or summarizing data by accumulating it into a
               summary record.

compare        To compare two records; that is, to rank them
               for the merging process.

Details of exit procedures are given below under the heading
Writing Exit Procedures.

Usage

dcl merge_ entry ((*)char(*), char(*), (*)ptr,
                  char(*), fixed bin(35));

call merge_ (input_specs, output_spec, merge_desc,
             user_out_sw, code);

where:

1. input_specs    is an array containing the specifications of
                  the input files. Up to 10 input files may be
                  specified. The array extent specifies the
                  number of input files. (Input)

                  Input file i is specified in the array
                  element input_specs(i), in one of the
                  following forms:

                  -input_file pathname
                  -if pathname
                       If an input file is in the Multics
                       storage system and its file organization
                       is either sequential or indexed, then it
                       may be specified by its pathname. The
                       file may be either a single segment or a
                       multisegment file. The star convention
                       cannot be used.

                       An input file specified by a pathname
                       will be attached using the attach
                       description "vfile_ pathname".

                  -input_description attach_desc
                  -ids attach_desc
                       If an input file is not in the Multics
                       storage system or its file organization
                       is neither sequential nor indexed, then
                       it must be specified by an attach
                       description. The target I/O module
                       specified via the attach description
                       must support the sequential_input
                       opening mode and the iox_ entry point
                       read_record.

                  Pathnames and attach descriptions can be
                  intermixed in the input_specs array.



2. output_spec    is the specification of the output file.
                  Only one output file may be specified.
                  (Input)

                  The output file may be specified in one of
                  the following forms:

                  -output_file pathname
                  -of pathname
                       If the output file is in the Multics
                       storage system and its file organization
                       is sequential, then it may be specified
                       by its pathname. The file may be either
                       a single segment or a multisegment file.

                       The equals convention can be used. If
                       it is, it is applied to the pathname of
                       the first input file, and the first input
                       file must be specified by a pathname,
                       not by an attach description.

                       An output file specified by a pathname
                       will be attached using the attach
                       description "vfile_ pathname". Thus if
                       the file does not exist, it will be
                       created. If it does exist, it will be
                       overwritten.

                  -output_description attach_desc
                  -ods attach_desc
                       If the output file is not in the Multics
                       storage system or its file organization
                       is not sequential, then it must be
                       specified by an attach description. The
                       target I/O module specified via the
                       attach description must support the
                       sequential_output opening mode and the
                       iox_ entry point write_record.

3. merge_desc     is an array of pointers to the Merge
                  Description. See the heading Merge
                  Description below. (Input)



4. user_out_sw    specifies the destination of both the summary
                  report and diagnostic messages for errors
                  detected in the arguments to merge_ or in the
                  Merge Description. (Input)

                  This argument may have the following values:

                  ""         = write the summary report and
                               diagnostic messages via the I/O
                               switch user_output.

                  "-bf"      = do not write the summary report
                               and diagnostic messages. If any
                               errors are diagnosed, merge_
                               will return with the status code
                               bad_arg, but information about
                               the number and nature of the
                               errors is not available.

                  switchname = write the summary report and
                               diagnostic messages via the I/O
                               switch named switchname. The
                               switch must be attached and open
                               for stream output.

5. code           is a standard Multics status code returned by
                  merge_. Possible values are listed below
                  under the heading Status Codes. (Output)



NOTES 

Any pathname may be relative (to the user's current working
directory) or absolute. 



STATUS CODES 

The following status codes may be returned by merge_ (all
codes are in error_table_):

0                 Normal return (no errors).

bad_arg           One or more arguments specified to merge_,
                  including those in the Merge Description, was
                  invalid or inconsistent. The Merge will have
                  previously written diagnostic messages as
                  directed by the user_out_sw argument. The
                  merging process itself has not been started.

fatal_error       The Merge has encountered a fatal error
                  during the merging process. The Merge will
                  have previously generated a specific error
                  message and signalled the sub_error_
                  condition via the sub_err_ subroutine.

out_of_sequence   The call to merge_ is not in the sequence
                  required by the Merge; that is, merge_ has
                  been called after initiation of the Merge but
                  before termination of that invocation.

Merge Description

The Merge Description contains additional information to
specialize the Merge for a particular execution. The Merge
Description is specified via the merge_desc argument to merge_.
The information specified may be:

Keys  - Description of one or more key fields used for
        ranking records.

Exits - Specification of which exit points are to be used
        and the names of the corresponding user supplied
        exit procedures.

A Merge Description is required. As a minimum, the user
must specify how records are to be ranked, either by describing
key fields in the Keys statement or by naming a compare exit
procedure in the Exits statement. Other information in the Merge
Description is optional.

The Merge Description may be supplied to merge_ in either of
two forms, called source form and internal form.

The source form of the Merge Description is written exactly
as specified for the merge command (see the Multics Programmers'
Manual, Commands and Active Functions, Section III), and is
stored as an ASCII segment; that is, as an unstructured file in
the Multics storage system. If source form is used, then the
merge_desc argument to merge_ must have an array extent of 1 and
the one pointer must be a pointer to the segment. (The segment
must contain only the Merge Description.) The source form is
useful when the user writes the Merge Description and supplies it
to the procedure which calls merge_.

The internal form of the Merge Description is a set of one
or two structures. The merge_desc argument must have an array
extent of 2, and the two pointers are pointers to the two
structures. Any of the structures can be omitted; in that case
the corresponding pointer must be null. The pointers must be
specified in the array in the following order:

     addr(keys)
     addr(exits)

where the two structures (keys and exits) are defined below. The
internal form is useful when the procedure calling merge_
constructs the Merge Description.

KEYS STRUCTURE

The keys structure is used when the caller describes key
fields. The Merge's standard compare routine will then be used
to rank records. If the caller describes keys, then the compare
exit must not be specified.

If the caller does not describe keys, then the corresponding
pointer in the array merge_desc must be null and the compare exit
must be specified in the exits structure. The user supplied
compare routine will then be used to rank records.

The keys structure is:

dcl 1 keys,
      2 version fixed bin init(1),
      2 number fixed bin,
      2 key_desc(user_keys_number refer(keys.number)),
        3 datatype fixed char(8),
        3 size fixed bin(24),
        3 word_offset fixed bin(18),
        3 bit_offset fixed bin(6),
        3 desc char(3);

where:

1. version        is the version number of the structure (must
                  be 1).

2. number         is the number of key fields, established by
                  the value of user_keys_number.

3. key_desc       is an array of key descriptions. Each key
                  description is one element of the array. The
                  key descriptions must be specified in order,
                  the major key first and the most minor key
                  last.

4. datatype       is the data type of the key field. See the
                  table below for the encoding of datatype.
                  The value must be left justified within
                  datatype.

5. size           is the size of the key field, in units which
                  depend on the data type.

                  For string data types, size is the exact
                  length (characters or bits) of the field.

                  For arithmetic data types, size is the
                  precision (binary or decimal digits) of the
                  field. The space occupied is determined by
                  precision in combination with the data type.
                  The space occupied is not adjusted for an
                  aligned field. For example, for an aligned
                  fixed binary field of one word, size must be
                  specified as 35; for an aligned floating
                  binary field of two words, size must be
                  specified as 63. See the table below for the
                  semantics of size.

6. word_offset    is the word portion of the offset of the
                  beginning of the key field, relative to the
                  beginning of the record. Consider the record
                  as being aligned on a word boundary, as will
                  be the case for a Multics PL/I structure.
                  Words are numbered from 0 for the first word
                  of the record.

7. bit_offset

8. desc           indicates whether ranking for this key field
                  is to be ascending or descending. Possible
                  values are:

                  ""    = use ascending ranking.
                  "dsc" = use descending ranking.

DATATYPE ENCODING AND SEMANTICS OF SIZE

                   Encoding |          Semantics of size
                   of       |          (where size = n)
                   datatype | Unit              Range     Space Occupied

Character string   char       9-bit character   1 - 4095  n characters
(Multics ASCII)

Bit string         bit        1 bit             1 - 4095  n bits

Fixed binary       bin        1 bit             1 - 71    n + 1 bits

Floating binary    flbin      1 bit             1 - 63    n + 9 bits

Fixed decimal      dec        9-bit digit       1 - 59    n + 1 digits
(leading sign)

Floating decimal   fldec      9-bit digit       1 - 59    n + 2 digits
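
The table can be checked against the examples given for size (35 for a one-word fixed binary field, 63 for a two-word floating binary field). The Python sketch below is only a model of the table; the function name space_occupied_bits and the dictionary layout are assumptions made for the example, not part of the Merge interface.

```python
# Model of the table above: for each datatype encoding, the width of one
# unit in bits, the number of extra units beyond n, and the largest n.
_SIZE_TABLE = {
    "char":  (9, 0, 4095),  # n characters, 9 bits each
    "bit":   (1, 0, 4095),  # n bits
    "bin":   (1, 1, 71),    # n + 1 bits
    "flbin": (1, 9, 63),    # n + 9 bits
    "dec":   (9, 1, 59),    # n + 1 digits, 9 bits each
    "fldec": (9, 2, 59),    # n + 2 digits, 9 bits each
}


def space_occupied_bits(datatype: str, size: int) -> int:
    """Return the space a key field occupies, in bits, per the table."""
    unit_bits, extra_units, max_size = _SIZE_TABLE[datatype.strip()]
    if not 1 <= size <= max_size:
        raise ValueError(f"size {size} is out of range for {datatype.strip()}")
    return (size + extra_units) * unit_bits
```

With a 36-bit word, space_occupied_bits("bin", 35) gives 36 bits (one word) and space_occupied_bits("flbin", 63) gives 72 bits (two words), matching the examples in the description of size.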
EXITS STRUCTURE

The exits structure is:

dcl 1 exits,
      2 version fixed bin init(1),
      2 compare entry,
      2 reserved entry init(merge_$noexit),
      2 output_record entry;

where:

1. version        is the version number of the structure (must
                  be 1).

2. compare        specifies the entry point of a user supplied
                  compare exit procedure. If the caller
                  describes key fields (supplies a keys
                  structure), then this exit must not be
                  specified.

3. reserved       is reserved for future use.

4. output_record  specifies the entry point of a user supplied
                  output_record exit procedure.



ENTRY VARIABLES

In the exits structure, each exit point is specified via an
entry variable. The entry variable must be set (either
initialized or assigned) by a user procedure, normally the
procedure which calls merge_. The entry variable can identify
either an internal entry point (that is, an internal procedure)
or an external entry point of the procedure which sets the entry
variable; or it can identify an external entry point of another
user procedure.

If none of the exits declared in the exits structure is to
be used, then that structure can be omitted and the
corresponding pointer in the array merge_desc must be null. If
the structure is included but an exit specified in it is not to
be used, then the corresponding entry variable must be set to
merge_$noexit, which is declared:

dcl merge_$noexit entry external;

An exit point may not be altered after the call to merge_.
Any change to the entry variable thereafter will have no effect.
However, certain entry points can be disabled, as specified in
the descriptions of the individual exit procedures below.

Writing Exit Procedures

A user supplied exit procedure is called by the Merge to
perform a specified function. The user procedure must perform
that function, and then must return to the Merge. The user exit
procedure may perform additional functions desired by the user.

Certain exit procedures replace the corresponding standard 
routine of the Merge. Other exit procedures supplement the 
normal functions of the Merge. This is specified for each 
individual exit procedure below. 

The following exit points are provided* 

output_record 
compare 

All exit points may be active during the same invocation of
the Merge. 

The entry point names of all user supplied exit procedures
are defined by the user. Specific names are shown below only for
convenience in discussion.

COMPARE EXIT PROCEDURE

A compare exit procedure replaces the standard record
comparison procedure of the Merge. The Merge calls the compare
exit procedure each time the merging process is ready to rank two
records; that is, to determine which of the two is first in the
merged order.

A compare exit procedure must perform the following
function: The compare exit procedure receives as arguments a
pointer to each of the two records. The compare exit procedure
must determine which of the two records is first - or that they
are equal in rank - and must return a corresponding return value
to the Merge. The compare exit procedure is invoked as a
function.



Usage

compare: proc (rec_ptr_1, rec_ptr_2) returns(fixed bin(1));

dcl (rec_ptr_1 ptr,
     rec_ptr_2 ptr) parameter;
dcl result fixed bin(1);



return(result);
end compare;

where:

1. rec_ptr_1      is a pointer to a double word aligned buffer
                  containing the first record of the pair to be
                  compared. This record is always the first of
                  the two according to the original input
                  order. (Input)

2. rec_ptr_2      is a pointer to a double word aligned buffer
                  containing the second record of the pair to be
                  compared. (Input)

3. result         is the result of the comparison. (Output)

                  Possible values are:

                   0 = the two records rank equal.

                  -1 = the record pointed to by rec_ptr_1 ranks
                       first.

                  +1 = the record pointed to by rec_ptr_2 ranks
                       first.

If a compare exit procedure requires the length of either
record, it is available in the word preceding that record in the
form:

dcl rec_len fixed bin(21) aligned;
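
The return value convention described above (-1, 0, +1) can be modeled in Python. This is a sketch only: a real compare exit procedure is a PL/I function receiving record pointers, whereas here records are ordinary Python values and make_compare is a hypothetical helper. The sketch ranks on several key fields, major key first, with ascending or descending order per field.

```python
from typing import Any, Callable, List, Tuple

# One key field: a function extracting the field from a record, and a
# flag that is True when ranking on that field is descending.
KeySpec = Tuple[Callable[[Any], Any], bool]


def make_compare(key_specs: List[KeySpec]) -> Callable[[Any, Any], int]:
    """Build a compare function using the -1 / 0 / +1 convention."""

    def compare(rec_1: Any, rec_2: Any) -> int:
        for extract, descending in key_specs:
            a, b = extract(rec_1), extract(rec_2)
            if a == b:
                continue  # tie on this field; try the next, more minor key
            rec_1_first = (a < b) != descending
            return -1 if rec_1_first else 1
        return 0  # equal on every key field

    return compare
```

A result of 0 leaves the decision to the original input order, which the Merge preserves for records that rank equal.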

A compare exit procedure cannot alter either the content or
the length of either record.

OUTPUT_RECORD EXIT PROCEDURE

An output_record exit procedure supplements the standard
output writing function of the Merge. The Merge calls the
output_record exit procedure:

1. Each time it has determined the next record in ranked order
   from the merging process;

2. Once more after the last record has been obtained from the
   merging process (end of output);

3. Additionally, each time the output_record exit procedure
   returns with an action of insert.

The Merge gives the output_record exit procedure access to
two records:

1. The output record, about to be written to the output file.

2. The next record, the record leaving the merging process.

An output_record exit procedure need not perform any
processing. If it does not, then the output record is accepted
for the output file.

An output_record exit procedure may perform the following
functions, which are accomplished via the values of arguments
returned when the output_record exit procedure returns to the
Merge:

Accept the output record. This is accomplished by setting
action = 0.

Delete the output record. This is accomplished by setting
action = 1.

Delete the record leaving the merge. This is accomplished
by setting action = 2.

Insert one or more records after the output record. (At the
first call to the output_record exit procedure, records may
be inserted at the beginning of output. At the last call to
the output_record exit procedure, records may be inserted at
the end of output.) This is accomplished by setting
rec_ptr_2 to point to the record to be inserted, setting
rec_len_2 appropriately, and setting action = 3.

Alter the output record before it is written to the output
file. This is accomplished by altering the record pointed
to by rec_ptr_1 or setting rec_ptr_1 to point to another
record, setting rec_len_1 appropriately, and setting action
= 0 to accept (or action = 3 to insert).

Summarize data into the first record of a sequence of
records with equal keys, and delete the succeeding records
of the sequence. This may be accomplished as follows: At
the first call to the output_record exit procedure, set
equal key checking on (equal_key_sw = "1"). At subsequent
calls to the output_record exit procedure, if the output
record and the record leaving the merge have equal keys
(equal_key = 0), then accumulate data into the output record
and delete the record leaving the merge (action = 2). If the
two records have unequal keys (equal_key ^= 0), then accept
the output record (action = 0).

Summarize data into the last record of a sequence with equal
keys, and delete the preceding records of the sequence.
This may be accomplished as follows: At the first call to
the output_record exit procedure, set equal key checking on.
At subsequent calls, if the two records have equal keys then
accumulate data into a work area and delete the output
record (action = 1). If the two records have unequal keys,
then alter the output record using the accumulated data and
accept that record (action = 0).

Sequence check the output file. This is accomplished by
setting seq_check_sw = "1". If the output record will not
collate properly with the output file, or does not have its
keys in the position specified to the Merge, then set
seq_check_sw = "0".

Close the exit point so that the output_record exit
procedure will not be called again during this execution of
the Merge. This is accomplished by setting close_exit_sw =
"1".

The output_record exit procedure must return to the Merge
each time it is called.



Usage

output_record: proc(rec_ptr_1, rec_len_1, rec_ptr_2, rec_len_2,
                    action, equal_key, equal_key_sw,
                    seq_check_sw, close_exit_sw);

dcl (rec_ptr_1      ptr,
     rec_len_1      fixed bin(21),
     rec_ptr_2      ptr,
     rec_len_2      fixed bin(21),
     action         fixed bin,
     equal_key      fixed bin(1),
     equal_key_sw   bit(1),
     seq_check_sw   bit(1),
     close_exit_sw  bit(1)) parameter;



where:

1. rec_ptr_1      points to a double word aligned buffer
                  containing the output record. The
                  output_record exit procedure may alter the
                  contents of this record or may change the
                  pointer to point to another record. The
                  Merge uses the value of rec_ptr_1 returned to
                  it by the output_record exit procedure as
                  specified below in the description of the
                  action argument. (Input/Output)

                  At the first call to the output_record exit
                  procedure (beginning of output), there is no
                  output record and rec_ptr_1 = null().

2. rec_len_1      is the length of the output record in bytes.
                  The output_record exit procedure may change
                  the length of this record. The Merge uses
                  the value of rec_len_1 returned to it by the
                  output_record exit procedure as specified
                  below in the description of the action
                  argument. (Input/Output)

3. rec_ptr_2      points to a double word aligned buffer
                  containing the record leaving the merge. The
                  output_record exit procedure may not alter
                  the contents of this record. For all actions
                  except insert, the Merge will ignore the
                  value of rec_ptr_2 returned to it by the
                  output_record exit procedure. If the action
                  is insert, then the output_record exit
                  procedure must change rec_ptr_2 to point to
                  the record to be inserted. (Input/Output)

                  At the last call to the output_record exit
                  procedure (end of output), there is no record
                  leaving the merge and rec_ptr_2 = null().

4. rec_len_2      is the length of the record leaving the
                  merge. The output_record exit procedure may
                  not change the length of this record. For
                  all actions except insert, the Merge will
                  ignore the value of rec_len_2 returned to it
                  by the output_record exit procedure. If the
                  action is insert, then the output_record exit
                  procedure must set rec_len_2 to the length of
                  the record to be inserted. (Input/Output)

5. action         indicates the action to be taken upon return
                  to the Merge. (Input/Output)

                  Possible values of action are:

                  0 = accept the output record. The output
                      record is written to the output file.
                      The Merge uses the returned values of
                      rec_ptr_1 and rec_len_1 to identify the
                      record to be written. At the next call
                      to the output_record exit procedure, the
                      record leaving the merge becomes the new
                      output record, and a new record leaving
                      the merge has been obtained.

                      Each time the output_record exit
                      procedure is called, the Merge sets
                      action to this value.

                  1 = delete the output record. No record is
                      written to the output file. The Merge
                      ignores the returned values of rec_ptr_1
                      and rec_len_1. At the next call to the
                      output_record exit procedure, the record
                      leaving the merge becomes the new output
                      record, and a new record leaving the
                      merge has been obtained.

                  2 = delete the record leaving the merge.
                      (This action should be used for
                      summarization into the output record.)
                      No record is written to the output file.
                      At the next call to the output_record
                      exit procedure, the output record remains
                      the same, and a new record leaving the
                      merge has been obtained. The Merge uses
                      the returned values of rec_ptr_1 and
                      rec_len_1 to identify the output record
                      for that next call to the output_record
                      exit procedure.

                  3 = insert a record after the output record.
                      The output record is written to the
                      output file. The Merge uses the returned
                      values of rec_ptr_1 and rec_len_1 to
                      identify the record to be written. The
                      Merge calls the output_record exit
                      procedure again, so that the inserted
                      record may be accepted or an additional
                      record may be inserted. At this next
                      call to the output_record exit procedure,
                      the inserted record becomes the new
                      output record, and the record leaving the
                      merge remains the same. The Merge uses
                      the returned values of rec_ptr_2 and
                      rec_len_2 to identify the inserted
                      record.

                  At the last call to the output_record exit
                  procedure (end of output), if the
                  output_record exit procedure inserts records
                  then they are appended at the end of output.
                  Any other value for action means do not
                  append any records, and the output_record
                  exit will not be taken again.
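
The four actions can be traced with a small model. The Python sketch below is an illustration under stated assumptions, not the Merge's implementation: records are Python values instead of buffers, None stands for a null pointer, equal_key is always computed (as if equal_key_sw were "1"), and the sequence check and close switches are left out. The names run_output_exit and summarize are hypothetical. The text does not say what action 2 does at the last call; the sketch writes the output record in that case, which is a guess.

```python
from typing import Any, Callable, Iterable, List, Optional, Tuple

ACCEPT, DELETE, DELETE_NEXT, INSERT = 0, 1, 2, 3

# An exit receives (output record, record leaving the merge, equal_key)
# and returns (action, output record, record to insert or None).
ExitProc = Callable[[Optional[Any], Optional[Any], int],
                    Tuple[int, Optional[Any], Optional[Any]]]


def run_output_exit(ranked: Iterable[Any], key: Callable[[Any], Any],
                    exit_proc: ExitProc) -> List[Any]:
    """Drive exit_proc over records already in ranked order."""
    written: List[Any] = []
    source = iter(ranked)
    out_rec: Optional[Any] = None        # first call: no output record
    next_rec = next(source, None)
    while True:
        at_end = next_rec is None        # last call: no record leaving
        if out_rec is None or at_end:
            equal_key = 1                # only one record is present
        else:
            equal_key = 0 if key(out_rec) == key(next_rec) else 1
        action, out_rec, ins_rec = exit_proc(out_rec, next_rec, equal_key)
        if action == INSERT:
            if out_rec is not None:
                written.append(out_rec)
            out_rec = ins_rec            # inserted record is the new output
            continue                     # record leaving the merge is kept
        if action == DELETE_NEXT and not at_end:
            next_rec = next(source, None)
            continue                     # output record stays the same
        if action != DELETE and out_rec is not None:
            written.append(out_rec)      # accept (assumed also for 2 at end)
        if at_end:
            return written
        out_rec, next_rec = next_rec, next(source, None)


def summarize(out_rec, next_rec, equal_key):
    """Sum the second field into the first record of each equal-key run."""
    if out_rec is not None and next_rec is not None and equal_key == 0:
        return DELETE_NEXT, (out_rec[0], out_rec[1] + next_rec[1]), None
    return ACCEPT, out_rec, None
```

For example, run_output_exit([("a", 1), ("a", 2), ("b", 5)], lambda r: r[0], summarize) collapses the two "a" records into one record carrying the sum, then writes the "b" record unchanged.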

6. equal_key      indicates whether the output record and the
                  record leaving the merge have equal keys.
                  (Input)

                  Possible values are:

                   0 = the two records rank equal.

                  +1 = the two records do not rank equal. At
                       the first and last calls to the
                       output_record exit procedure (beginning
                       of output and end of output), only one
                       record is present and the Merge sets
                       equal_key to this value.

                  If the user supplied key descriptions, then
                  the value of equal_key is determined only by
                  those key fields; the original input order
                  of the two records is not used to resolve key
                  equality. If the user supplied a compare
                  exit procedure, then the Merge uses the
                  result of that compare exit procedure to set
                  the value of equal_key. (In either case, if
                  the two records rank equal then rec_ptr_1
                  points to the record which is first according
                  to the original input order of the two
                  records.)

7. equal_key_sw   indicates whether or not equal key checking
                  is to be performed. (Input/Output)

                  Possible values are:

                  "0" = do not check for equal keys. At the
                        first call to the output_record exit
                        procedure (beginning of output), the
                        Merge sets equal_key_sw to this value.

                  "1" = check for equal keys before the next
                        call to the output_record exit
                        procedure.

                  Since equal key checking takes time, the user
                  should set equal_key_sw = "1" only when
                  required for actions such as summarization.

8. seq_check_sw   indicates whether or not sequence checking is
                  to be performed. (Input/Output)

                  Possible values are:

                  "0" = do not sequence check.

                  "1" = sequence check. At the first call to
                        the output_record exit procedure
                        (beginning of output), the Merge sets
                        seq_check_sw to this value.

                  Sequence checking means comparing the output
                  record to the record previously written to
                  the output file. Sequence checking is
                  performed after the output_record exit
                  procedure returns to the Merge, and only if a
                  record is to be written to the output file
                  (that is, only if the action is accept or
                  insert). If the user supplied key
                  descriptions, then the Merge's key comparison
                  routine is used. If the user supplied a
                  compare exit procedure, then that exit
                  procedure is called.

                  If the output record is out of sequence with
                  the previous record, then the status code
                  fatal_error is returned to the caller of
                  merge_; see the entry merge_ above.

                  All records written to the output file,
                  including inserted records, can be sequence
                  checked.
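
Sequence checking as described here amounts to comparing each record about to be written with the one written before it. The Python sketch below models that test using the compare convention (-1, 0, +1) defined for the compare exit; the function names and the two-field record layout are assumptions made for the example.

```python
from typing import Any, Callable, Iterable, Optional, Tuple

Record = Tuple[str, int]


def compare_by_key(rec_1: Record, rec_2: Record) -> int:
    """Rank on the first field, ascending: -1, 0, or +1."""
    if rec_1[0] == rec_2[0]:
        return 0
    return -1 if rec_1[0] < rec_2[0] else 1


def in_sequence(previous: Optional[Any], current: Any,
                compare: Callable[[Any, Any], int]) -> bool:
    """True if current may follow previous in the output file."""
    # A result of +1 means current ranks before the record already written.
    return previous is None or compare(previous, current) <= 0


def first_out_of_sequence(records: Iterable[Any],
                          compare: Callable[[Any, Any], int]) -> Optional[int]:
    """Return the index of the first out-of-sequence record, or None."""
    previous = None
    for index, record in enumerate(records):
        if not in_sequence(previous, record, compare):
            return index
        previous = record
    return None
```

In the Merge itself an out-of-sequence record ends the run with the status code fatal_error; the sketch only reports where the sequence breaks.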

9. close_exit_sw  indicates whether the exit is to be closed
                  hereafter. (Input/Output)

                  Possible values are:

                  "0" = keep this exit open. Each time the
                        output_record exit procedure is called,
                        the Merge sets close_exit_sw to this
                        value.

                  "1" = close this exit. The Merge will not
                        call the output_record exit procedure
                        again during this execution of the
                        Merge (even if the action is insert).

RECORD POINTERS

Since the Merge aligns each record in a buffer on a double
word boundary, if an exit procedure applies a based declaration
of the record to the pointer(s) then correct alignment is
ensured.



ORIGINAL INPUT ORDER (FIFO)

For the compare and output_record exit procedures, rec_ptr_1
always points to the record whose original input order was prior
to the record pointed to by rec_ptr_2. If a compare exit
procedure returns with an equal ranking for the two records, then
this original input order is preserved. Original input order has
been defined earlier under the heading Key Fields.
Name: sort

The sort command is described in the Multics Programmers'
Manual, Commands and Active Functions, Section III. This
description includes only additional optional control arguments
which are not described in MPM Commands.



sort input_specs output_specs control_args

where:

3. control_args   can be chosen from the following (in addition
                  to those control arguments specified in MPM
                  Commands):

   -time          prints timing information for the Sort:
                       System load (hmu)
                       Merge order
                       String size
                  and for each phase of the Sort:
                       Elapsed time
                       Real CPU time
                       Virtual CPU time
                       Page faults
                       Paging device faults
                       Comparisons executed

                  (Times are given in seconds.)

   -merge_order m specifies that the merge order is to be
                  m. The argument m must be a decimal
                  integer. This argument is meaningful
                  only if all input files are in the
                  storage system, so that the total input
                  file size can be obtained by the Sort.

   -string_size s specifies that the string size (as
                  produced during the presort) is to be s
                  bytes. The argument s must be a decimal
                  integer, and must be less than the
                  system maximum segment size. The actual
                  size of any string may differ somewhat
                  from s, since the length of the last
                  record inserted into the string may not
                  exactly match the space available.

                  Merge order and string size cannot both be
                  specified.

   -debug         specifies that temporary files will be left
                  initiated (but truncated to zero length)
                  after completion of the Sort. This argument
                  is intended for use with performance
                  measurement and analysis tools which print
                  reference names, such as sample_refs.

                  If this argument is omitted, temporary files
                  will be deleted after completion of the Sort.

                  If -debug is specified, deletion of temporary
                  files must be done explicitly by the user.
                  Some temporary files are in the process
                  directory; the work files are in the
                  directory specified by the -temp_dir
                  argument. The names of all temporary files
                  are generated uniquely for each invocation of
                  the Sort, and always contain the string
                  "sort_".

Name: merge

The merge command is described in the Multics Programmers'
Manual, Commands and Active Functions, Section III. This
description includes only additional optional control arguments
which are not described in MPM Commands.



merge input_specs output_specs control_args

where:

3. control_args   can be chosen from the following (in addition
                  to those control arguments specified in MPM
                  Commands):

   -time          prints timing information for the Merge:
                       System load (hmu)
                  and for each phase of the Merge:
                       Elapsed time
                       Real CPU time
                       Virtual CPU time
                       Page faults
                       Paging device faults
                       Comparisons executed

                  (Times are given in seconds.)

   -debug         specifies that temporary files will be
                  left initiated (but truncated to zero
                  length) after completion of the Merge.
                  This argument is intended for use with
                  performance measurement and analysis
                  tools which print reference names, such
                  as sample_refs.

                  If this argument is omitted, temporary
                  files will be deleted after completion
                  of the Merge.

                  If -debug is specified, deletion of
                  temporary files must be done explicitly
                  by the user. All temporary files are in
                  the process directory. The names of all
                  temporary files are generated uniquely
                  for each invocation of the Merge, and
                  always contain the string "sort_".
(END) 
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Mflfflfi' sort_ 

The sort_ subroutine is descrioed in the Mul tics 
Programmers* Manualf Subroutines, Section II. This description 
includes only additional entry points which are not described in 
MPM Subroutines. 



Ent rvi sort_$inltiate 

The sort_$ini tiate entry point is used when the Sort is 
"driven" by its caller. The Sort is said to be driven if the 
caller supplies a procedure which calls (or directly performs} 
the input file processing and outout file processing procedures. 
Such a driver must have the following general foriBi 

call sort_$lnitiate(arguments> » 

call inout_f i I e_proc (code) ? 

call sort_Scommence(code) ; 

call output_f i I e_proc tcod€) 1 

call soPt_Jterminate{code) ; 

where:

1. sort_$initiate     is the procedure of the Sort which must
                      be called first (it "initiates" the
                      Sort).

2. input_file_proc    is an input_file procedure, as specified
                      in the description of the sort_
                      subroutine in MPM Subroutines. Instead
                      of calling an input_file procedure, the
                      driver may perform the necessary
                      functions directly.

3. sort_$commence     is the procedure of the Sort which must
                      be called when the input_file procedure
                      has completed releasing records to the
                      sorting process (it "commences" the
                      merging process). See the entry
                      sort_$commence below.

4. output_file_proc   is an output_file procedure, as
                      specified in the description of the
                      sort_ subroutine in MPM Subroutines.
                      Instead of calling an output_file
                      procedure, the driver may perform the
                      necessary functions directly.

5. sort_$terminate    is the procedure of the Sort which must
                      be called when the output_file procedure
                      has completed retrieving records from
                      the Sort (it "terminates" the sorting
                      process). See the entry sort_$terminate
                      below.

The entry points sort_$initiate, sort_$commence, and
sort_$terminate are specifically designed to be used by COBOL 
object programs. They support the ANSI COBOL Sort/Merge Module, 
Level 2 (the SORT, RELEASE, and RETURN statements). 

Normally, when called as a command (sort) or as a subroutine 
(sort_), the Sort itself contains the driver to perform the five 
calls listed above.

     dcl sort_$initiate entry(char(*), ptr, ptr,
                    char(*), float bin(27), fixed bin(35));

     call sort_$initiate(temp_dir, keys_ptr, exits_ptr,
                    user_out_sw, file_size, code);

where:

1. temp_dir      is the pathname of the directory which will
                 contain the Sort's work files. If this
                 argument is "", then work files will be
                 contained in the user's process directory.

                 This argument should be used when the process
                 directory will not be large enough to contain
                 the work files. The get_dir_ function may be
                 used to obtain the name of the user's current
                 working directory. (Input)

2. keys_ptr      is a pointer to the keys structure, which
                 describes the key fields to be used for
                 ranking records. This structure is identical
                 to that specified under the heading Keys
                 Structure in the description of the sort_
                 subroutine in MPM Subroutines, Section II.
                 If the user is supplying a compare exit
                 procedure, then keys_ptr must be null and the
                 compare procedure must be specified in the
                 exits structure. (Input)

3. exits_ptr     is a pointer to the exits structure, which
                 specifies which exit points are to be used
                 and gives the entry point names of the
                 corresponding user supplied exit procedures.
                 This structure is identical to that specified
                 under the heading Exits Structure in the
                 description of the sort_ subroutine in MPM
                 Subroutines, Section II. If no exits are to
                 be used, then exits_ptr must be null. If the
                 compare exit is specified, then keys must not
                 be described. (Input)

4. user_out_sw   specifies the destination of both the Sort's
                 summary report and diagnostic messages for
                 errors detected in the arguments to
                 sort_$initiate. (Input)

                 This argument may have the following values:

                 ""         = write the summary report and
                              diagnostic messages via the I/O
                              switch user_output.

                 "-bf"      = do not write the summary report
                              and diagnostic messages. If any
                              errors are diagnosed,
                              sort_$initiate will return with
                              the status code bad_arg but
                              information about the number and
                              nature of the errors is not
                              available.

                 switchname = write the summary report and
                              diagnostic messages via the I/O
                              switch named switchname. This
                              switch must be attached and open
                              for stream output.

5. file_size     is the total amount of data to be sorted, in
                 millions of bytes. If this argument is zero,
                 the default assumption is approximately one
                 million bytes (file_size = 1.0). (Input)

                 The file_size argument is used for
                 optimization of performance; the actual
                 amount of data can be considerably larger
                 without preventing the Sort from completing.
                 The maximum amount of data which can be
                 sorted is (in bytes) approximately 60 million
                 times the square root of file_size.

6. code          is a standard Multics status code returned by
                 sort_$initiate. Possible values are listed
                 below under the heading Status Codes.
                 (Output)
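
The capacity rule given for file_size can be worked through numerically. The following is a minimal sketch in Python (not part of the Sort, and not Multics code) that applies only the approximation stated above: about 60 million bytes times the square root of file_size, with a zero argument treated as 1.0. The function name and the rejection of negative values are assumptions made for the illustration.

```python
import math

# Default stated in the text: a file_size of zero is treated as 1.0
# (about one million bytes).
DEFAULT_FILE_SIZE = 1.0


def approx_max_sortable_bytes(file_size: float) -> float:
    """Approximate upper bound, in bytes, on the data that can be sorted.

    file_size is expressed in millions of bytes, as for sort_$initiate.
    Rejecting negative values is an assumption of this sketch.
    """
    if file_size < 0:
        raise ValueError("file_size must not be negative")
    effective = file_size if file_size != 0 else DEFAULT_FILE_SIZE
    return 60e6 * math.sqrt(effective)


for size in (0.0, 1.0, 4.0, 25.0):
    limit = approx_max_sortable_bytes(size)
    print(f"file_size = {size:4.1f}  ->  about {limit / 1e6:.0f} million bytes")
```

Because the limit grows only with the square root, quadrupling file_size merely doubles the approximate maximum, so a caller expecting very large inputs should supply a realistic estimate and not rely on the default.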



Entry variables in the exits structure should be set 
(either initialized or assigned) by the procedure which calls the 
sort_$initiate entry point.

In order that the Sort can be terminated properly in case of 
an abnormal exit, the cleanup procedure of the caller of 
sort_$initiate must include a call to the entry point 
sort_$terminate.
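
The cleanup requirement can be illustrated with a small model. The sketch below is written in Python for the sake of a runnable example; it is not the PL/I a real driver would use. StubSort is a hypothetical stand-in that only records the calls it receives, and drive is a hypothetical driver. The try/finally construct plays the role of the caller's cleanup procedure, with the difference that it also covers the normal path.

```python
class StubSort:
    """Hypothetical stand-in for the Sort; it only records calls."""

    def __init__(self):
        self.calls = []

    def initiate(self):
        self.calls.append("initiate")

    def commence(self):
        self.calls.append("commence")

    def terminate(self):
        self.calls.append("terminate")


def drive(sort, input_file_proc, output_file_proc):
    """Run the five-step driver sequence, always ending with terminate."""
    sort.initiate()
    try:
        input_file_proc()    # releases records to the sorting process
        sort.commence()
        output_file_proc()   # retrieves the ranked records
    finally:
        # Reached on normal completion and on any abnormal exit, so the
        # Sort is never left initiated.
        sort.terminate()


def failing_input():
    raise RuntimeError("input device failed")


sort = StubSort()
try:
    drive(sort, failing_input, lambda: None)
except RuntimeError:
    pass
print(sort.calls)
```

In this model a failure inside the input_file procedure still results in a call to terminate, which is the property the cleanup procedure is required to guarantee.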

Status Codes

The following status codes may be returned by sort_$initiate
(all codes are in error_table_):

0                 Normal return (no errors).

bad_arg           One or more arguments specified to
                  sort_$initiate, including the keys and exits
                  structures, were invalid or inconsistent. The
                  Sort will have previously written diagnostic
                  messages as directed by the user_out_sw
                  argument. The sorting process itself has not
                  been started.

fatal_error       The Sort has encountered a fatal error. The
                  Sort will have previously generated a
                  specific error message and signalled the
                  sub_error_ condition via the sub_err_
                  subroutine.

out_of_sequence   The call to sort_$initiate is not in the
                  sequence required by the Sort; e.g.,
                  sort_$initiate has been called after
                  initiation of the Sort but before normal
                  termination of that invocation via a call to
                  sort_$terminate.






Entry: sort_$commence

The sort_$commence entry point must be called after the
driver of the Sort has completed its input_file procedure. See
the entry point sort_$initiate above. The call to sort_$commence
informs the Sort that end of input has been reached. Upon return
from sort_$commence, the driver can begin its output_file
procedure.

     dcl sort_$commence entry(fixed bin(35));

     call sort_$commence(code);

where code is a standard Multics status code returned by
sort_$commence. Possible values are listed below under the
heading Status Codes. (Output)

Status Codes

The following status codes may be returned by sort_$commence
(all codes are in error_table_):

0                 Normal return (no errors).

fatal_error       The Sort has encountered a fatal error during
                  the sorting process. The Sort will have
                  previously generated a specific error message
                  and signalled the sub_error_ condition via
                  the sub_err_ subroutine.

out_of_sequence   The call to sort_$commence is not in the
                  sequence required by the Sort; e.g.,
                  sort_$commence has been called before
                  sort_$initiate.

Entry: sort_$terminate

The sort_$terminate entry point must be called after the
driver of the Sort has completed its output_file procedure. See
the entry point sort_$initiate above. The call to
sort_$terminate informs the Sort that the current execution of
the Sort is complete. Upon return from sort_$terminate, the
caller can initiate another execution of the Sort.



     dcl sort_$terminate entry(fixed bin(35));

     call sort_$terminate(code);

where code is a standard Multics system status code returned by
sort_$terminate. Possible values are listed below under the
heading Status Codes. (Output)

Status Codes

The following status codes may be returned by
sort_$terminate (all codes are in error_table_):

0                 Normal return (no errors).

out_of_sequence   The call to sort_$terminate is not in the
                  sequence required by the Sort; e.g.,
                  sort_$terminate has been called before
                  sort_$initiate.
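
The out_of_sequence conditions described for the three entry points amount to a small state machine. The following Python sketch models only that ordering rule; it is an illustration, not the Sort's implementation, and every name in it is hypothetical. Two points are assumptions that the text does not state: a second call to commence is treated as out of sequence, and terminate is accepted at any time after initiate (inferred from the requirement that a cleanup procedure call it).

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"            # no Sort in progress
    INITIATED = "initiated"  # accepting input records
    COMMENCED = "commenced"  # end of input reached, output available


OK = 0                               # normal return
OUT_OF_SEQUENCE = "out_of_sequence"  # stands in for the error_table_ code


class SortSequenceModel:
    """Tracks only the order in which the entry points are called."""

    def __init__(self):
        self.state = State.IDLE

    def initiate(self):
        # Initiating again before terminating is out of sequence.
        if self.state is not State.IDLE:
            return OUT_OF_SEQUENCE
        self.state = State.INITIATED
        return OK

    def commence(self):
        # Must follow initiate; rejecting a repeat call is an assumption.
        if self.state is not State.INITIATED:
            return OUT_OF_SEQUENCE
        self.state = State.COMMENCED
        return OK

    def terminate(self):
        # Terminating before initiating is out of sequence.
        if self.state is State.IDLE:
            return OUT_OF_SEQUENCE
        self.state = State.IDLE
        return OK


model = SortSequenceModel()
print(model.commence())   # called before initiate
print(model.initiate())
print(model.commence())
print(model.terminate())
```

After terminate returns normally the model is back in its idle state, matching the statement that the caller can then initiate another execution of the Sort.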